Abstract


This work is concerned with generalized convex programming problems,
where the objective and also the constraints belong to a certain class of convex
functions. It examines the relationship of two conditions for generalized convex
programming (self-concordance and a relative Lipschitz condition) and gives
an outline for a short and simple analysis of an interior-point method for
generalized convex programming. It generalizes ellipsoidal approximations for the
feasible set, and in the special case of a nondegenerate linear program it
establishes a uniform bound on the condition number of the matrices occurring
when the iterates remain near the path of centers.


Key  words:  convex  program,  ellipsoidal  approximation,  relative  Lipschitz 
condition,  self-concordance 


INTRODUCTION 

In earlier papers, Jarre [4, 5], Mehrotra and Sun [9], and Nesterov and Nemirovsky
[11, 12] tried to find a rather general class of convex programs that can be solved
by interior-point methods. These authors use logarithmic barrier functions in their
algorithms. Jarre and Mehrotra and Sun have imposed certain conditions on the
constraint functions $f_i$, while Nesterov and Nemirovsky require the barrier function
to be self-concordant. In all cases the conditions guarantee that Newton's method
for minimizing the barrier function converges with a fixed rate of convergence.

When  summarizing  and  relating  some  of  the  above  results  here,  we  attach  great 
importance  to  the  underlying  geometry  and  structure  of  the  method.  To  date,  a 
large  variety  of  interior-point  methods  and  search  directions  have  been  suggested,  all 
of  which  follow  the  same  two  components:  centering  and/or  progress  in  the  objective 
function.  For  the  sake  of  clarity,  only  the  method  of  centers  is  examined  in  detail  to 
illustrate  the  geometry  that  is  shared  by  all  these  methods  and  to  form  a  foundation 
on  which  any  of  these  methods  can  easily  be  analyzed.  A  short  outline  of  how  to 
derive  a  practical  algorithm  from  the  results  presented  here  is  given  in  Section  2.7. 

†This work was supported by a research grant from the Deutsche Forschungsgemeinschaft, and
in part by the U.S. National Science Foundation Grant DDM-8715153 and the Office of Naval
Research Grant N00014-90-J-1242.

*On leave from Institut für Angewandte Mathematik, University of Würzburg, 8700 Würzburg,
(West) Germany.


1.  PROBLEM  AND  CONDITIONS 

The problem under study is to find

$$\lambda^* := \min\{\, f_0(x) \mid x \in P \,\},$$

where

$$P := \{\, x \in \mathbb{R}^n \mid f_i(x) \le 0 \text{ for } 1 \le i \le m \,\}, \tag{1.0}$$

and the $f_i \in C^2(P)$ are convex functions that fulfill certain conditions specified in
Subsection 1.2. The first and second derivatives of $f_i(x)$ will sometimes be referred
to as a row vector $Df_i(x)$ and a square matrix $D^2 f_i(x)$, and sometimes as a linear
form $Df_i(x)[\cdot]$ and a symmetric bilinear form $D^2 f_i(x)[\cdot,\cdot]$.

For the sake of simplicity we assume that the interior of the feasible set $P$ is
nonempty and bounded. Given a point $y$ in the intersection of the domains of the
functions $f_i$, one can use a phase 1 algorithm as in the appendix of [3] to guarantee
this assumption.

1.1. Possible Conditions on the $f_i$

1.1.1. Self-concordance

The most general condition is given in Nesterov and Nemirovsky [11], requiring that
the barrier functions $\varphi_i(x) := -\ln(-f_i(x))$ are $\alpha$-self-concordant on the interior
$P^\circ$ of $P$ for $1 \le i \le m$. Likewise, for $\lambda > \lambda^*$, the function
$\varphi_0(x, \lambda) := -\ln(\lambda - f_0(x))$ is required to be self-concordant on $P^\circ$.

Definition  (self-concordance) 

Here, in slight variation to the definition of [12], a function $\varphi : P^\circ \to \mathbb{R}$ is called
self-concordant on $P^\circ$ with parameter $\alpha$ (in signs: $\varphi \in S_\alpha(P^\circ)$) if $\varphi$ is three times
continuously differentiable in $P^\circ$ and if for all $x \in P^\circ$ and all $h \in \mathbb{R}^n$ the following
inequality holds:

$$|D^3\varphi(x)[h,h,h]| \le 2\sqrt{\alpha}\,\bigl(D^2\varphi(x)[h,h]\bigr)^{3/2}. \tag{1.1}$$

Intuitively, large values of $\alpha$ imply that the third derivative may be large, i.e. that
$\varphi$ cannot be well approximated by a quadratic function. Clearly, linear or convex
quadratic functions fulfill (1.1) with a parameter $\alpha = 0$ on $\mathbb{R}^n$. However, we note
that condition (1.1) is not applied to the constraint functions $f_i$ themselves, but to
the associated barrier functions (which are not linear or quadratic, even if the $f_i$
are so).
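To make condition (1.1) concrete, the following minimal sketch evaluates, at a few sample points, the smallest $\alpha$ for which (1.1) holds in one dimension. The two test functions (the barriers of $f(x) = -x$ and of $f(x) = x^2 - 1$) and all helper names are our own illustrative choices, not part of the original text.

```python
from typing import Tuple


def required_alpha(d2: float, d3: float) -> float:
    """Smallest alpha satisfying (1.1) at one point of a 1D function.

    (1.1) reads |d3| <= 2*sqrt(alpha)*d2**1.5, i.e. alpha >= (|d3| / (2*d2**1.5))**2.
    """
    return (abs(d3) / (2.0 * d2 ** 1.5)) ** 2


def barrier_linear(x: float) -> Tuple[float, float]:
    """Second and third derivative of phi(x) = -ln(x), the barrier of f(x) = -x."""
    return 1.0 / x ** 2, -2.0 / x ** 3


def barrier_quadratic(x: float) -> Tuple[float, float]:
    """Second and third derivative of phi(x) = -ln(1 - x**2), the barrier of f(x) = x**2 - 1."""
    # phi(x) = -ln(1 - x) - ln(1 + x), differentiated term by term
    d2 = 1.0 / (1.0 - x) ** 2 + 1.0 / (1.0 + x) ** 2
    d3 = 2.0 / (1.0 - x) ** 3 - 2.0 / (1.0 + x) ** 3
    return d2, d3


linear_alphas = [required_alpha(*barrier_linear(x)) for x in (0.01, 0.5, 3.0, 100.0)]
quadratic_alphas = [required_alpha(*barrier_quadratic(x)) for x in (-0.99, -0.5, 0.0, 0.3, 0.99)]
print(max(linear_alphas), max(quadratic_alphas))
```

For the linear constraint the required parameter equals 1 at every point, and for the quadratic constraint it never exceeds 1 at the sampled points, which is in line with the value $\alpha = 1$ used later for linear and convex quadratic constraints.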

For the sum $\varphi(x) = \sum_{i=1}^m \varphi_i(x)$ the following property is also required in [12].

Definition (strong self-concordance)

A function $\varphi : P^\circ \to \mathbb{R}$ is called strongly $\alpha$-self-concordant (in signs: $\varphi \in S_\alpha^+(P^\circ)$)
if it is $\alpha$-self-concordant and if the level sets $\{x \in P^\circ \mid \varphi(x) \le t\}$ are closed in $\mathbb{R}^n$ for
all $t \in \mathbb{R}$.

Intuitively this means that $\varphi(x)$ goes to infinity as $x$ approaches the boundary
of $P^\circ$, a condition that is naturally fulfilled if the function $\varphi$ is a penalty function
defined as above, but may not hold for a general self-concordant function.

Remark  1  (Proposition  1.1  in  [12]) 

The concept of self-concordance is affinely invariant in the following sense. If $A$ is
an invertible affine mapping, $A : \mathbb{R}^n \to \mathbb{R}^n$, and $\varphi$ is an $\alpha$-self-concordant function,
i.e. $\varphi \in S_\alpha(P^\circ)$, then the function defined by $\tilde\varphi(x) := \varphi(A^{-1}x)$ is again
$\alpha$-self-concordant, $\tilde\varphi \in S_\alpha(AP^\circ)$.

Further the following simple rule for addition and scaling of self-concordant
functions holds. If the functions $\varphi_i$ are self-concordant with parameters $\alpha_i$ on the
domains $P_i^\circ$ for $i = 1, 2$ and if $r_i$ are positive real numbers, then the function
$\varphi := r_1\varphi_1 + r_2\varphi_2$ is $\alpha$-self-concordant with $\alpha = \max\{\alpha_1/r_1, \alpha_2/r_2\}$ on the domain
$P^\circ := P_1^\circ \cap P_2^\circ$.

Proof: Straightforward. |

Note  that  Remark  1  also  holds  for  the  property  of  strong  self-concordance. 
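The addition and scaling rule of Remark 1 can be checked numerically on a small example. The sketch below is only an illustration under our own assumptions: it uses the one-dimensional barriers $\varphi_1(x) = -\ln x$ and $\varphi_2(x) = -\ln(1-x)$ (both with parameter 1) on the interval $(0,1)$, and the function names are hypothetical.

```python
from typing import Tuple


def required_alpha(d2: float, d3: float) -> float:
    """Smallest alpha for which (1.1) holds at one point of a 1D function."""
    return (abs(d3) / (2.0 * d2 ** 1.5)) ** 2


def weighted_barrier(x: float, r1: float, r2: float) -> Tuple[float, float]:
    """Second and third derivative of r1*(-ln x) + r2*(-ln(1 - x)) on (0, 1)."""
    d2 = r1 / x ** 2 + r2 / (1.0 - x) ** 2
    d3 = -2.0 * r1 / x ** 3 + 2.0 * r2 / (1.0 - x) ** 3
    return d2, d3


def worst_alpha(r1: float, r2: float, samples: int = 999) -> float:
    """Largest pointwise requirement on alpha over an equidistant grid in (0, 1)."""
    grid = (k / (samples + 1) for k in range(1, samples + 1))
    return max(required_alpha(*weighted_barrier(x, r1, r2)) for x in grid)


# Remark 1 predicts alpha = max{1/r1, 1/r2} because alpha_1 = alpha_2 = 1 here.
predicted = max(1.0 / 2.0, 1.0 / 0.5)
observed = worst_alpha(2.0, 0.5)
print(observed, predicted)
```

On the grid the observed requirement stays below the predicted value $\max\{\alpha_1/r_1, \alpha_2/r_2\}$; it approaches that value near the end of the interval where the constraint with the smaller weight dominates.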

1.1.2.  Relative  Lipschitz  condition 

The same motivation as above, having a function which is close to a quadratic
function, has led to the definition of the following Relative Lipschitz Condition in
Jarre [5]. (See also [2].) The functions $f_i$ ($0 \le i \le m$) are supposed to be continuous
on $P$ and twice continuously differentiable on $P^\circ$, with Hessian matrices
$D^2 f_i$ fulfilling the Relative Lipschitz Condition

$$\exists M \ge 0 : \ \forall z \in \mathbb{R}^n \ \ \forall y \in P^\circ \ \ \forall h \text{ with } \|h\|_{H_i(y)} \le 0.5/(1 + M^{1/3}):$$

$$|z^T (D^2 f_i(y + h) - D^2 f_i(y)) z| \le M \,\|h\|_{H_i(y)}\, z^T D^2 f_i(y) z, \tag{1.2}$$

which bounds the relative change of $D^2 f_i$ in neighboring points $y$ and $y + h$ for small
$\|h\|_{H_i(y)}$. Here, $\|\cdot\|_{H_i(y)}$ is a certain semi-norm that makes (1.2) affine invariant and
is specified below. Again it is obvious that linear or convex quadratic functions
$f_i$ fulfill condition (1.2) with $M = 0$. (This condition is applied to the constraint
functions $f_i$ directly!) The precise definition of the matrix $H_i(y)$ and the associated
semi-norm is given by

$$H_i(y) := \frac{D^2 f_i(y)}{-f_i(y)} + \frac{Df_i(y)^T Df_i(y)}{f_i^2(y)}$$

and $\|h\|^2_{H_i(y)} := h^T H_i(y) h$. The matrices $H_i(y)$ arise as the Hessians of the
logarithmic barrier functions. As shown below, the norm given by $H(y) := \sum_{i=1}^m H_i(y)$ is
closely related to the shape of the feasible set $P$, and is a very convenient measure
for analyzing Newton's method. Clearly $H_i(y)$ is positive semidefinite, since $f_i$ is
assumed to be convex. Note that condition (1.2) requires that $D^2 f_i(y + h)$ exists
for all $h$ with $\|h\|_{H_i(y)} \le 0.5/(1 + M^{1/3})$.

Remark 2

If condition (1.2) holds for the function $D^2 f_i$ at a point $y \in P^\circ$ (i.e. $f_i(y) < 0$) then
also $f_i(y + h) < 0$ for all $h$ with $\|h\|_{H_i(y)} \le 0.5/(1 + M^{1/3})$.

Proof: See Appendix. |
Given a strictly feasible point $y$, i.e. a point $y$ such that $f_i(y) < 0$ for all $1 \le i \le m$,
condition (1.2) is only needed for points $y + h$ with $\|h\|_{H_i(y)} \le 0.5/(1 + M^{1/3})$
for all $1 \le i \le m$. Remark 2 guarantees that also $f_i(y + h) < 0$ for all $i$. Hence
$y + h \in P^\circ$, so that in fact condition (1.2) is needed only for points $y, y + h \in P^\circ$.

Example

The Relative Lipschitz Condition allows certain singularities on the boundary of
$P$; for example, the second derivative of the function $f : \mathbb{R} \to \mathbb{R}$, $x \mapsto -\sqrt{x}$, fulfills the
condition with $M = 8$ on $P := \{x \mid x \ge 0\}$.
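The example can be probed numerically. The following sketch rests on two assumptions of ours: that $H(y)$ is the Hessian of the barrier $-\ln(-f(y))$, which for $f(x) = -\sqrt{x}$ gives $H(y) = 1/(2y^2)$, and that sampling a finite grid of admissible steps $h$ is representative. In one dimension the vector $z$ cancels from (1.2), so only the ratio of the two sides is evaluated.

```python
import math

M = 8.0  # constant claimed in the example


def hess_f(x: float) -> float:
    """Second derivative of f(x) = -sqrt(x)."""
    return 0.25 * x ** -1.5


def h_norm(h: float, y: float) -> float:
    """||h||_{H(y)} with H(y) = 1/(2*y**2), the Hessian of -ln(sqrt(y))."""
    return abs(h) * math.sqrt(0.5) / y


def lipschitz_ratio(y: float, h: float) -> float:
    """Left side of (1.2) divided by ||h||_{H(y)} * f''(y); (1.2) demands a value <= M."""
    return abs(hess_f(y + h) - hess_f(y)) / (h_norm(h, y) * hess_f(y))


RADIUS = 0.5 / (1.0 + M ** (1.0 / 3.0))  # largest admissible ||h||_{H(y)}


def worst_ratio(y: float, samples: int = 200) -> float:
    """Largest ratio over a grid of admissible steps in both directions."""
    worst = 0.0
    for k in range(1, samples + 1):
        norm = RADIUS * k / samples
        for sign in (-1.0, 1.0):
            h = sign * norm * y * math.sqrt(2.0)  # chosen so that h_norm(h, y) == norm
            worst = max(worst, lipschitz_ratio(y, h))
    return worst


print(worst_ratio(1.0))
```

Because both sides of (1.2) scale in the same way with $y$, the sampled ratio is the same near the singular boundary point $y = 0$ as far away from it, which illustrates why the condition tolerates this kind of singularity.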

1.1.3. Relationship between the relative Lipschitz condition and self-concordance

Loosely speaking, the Relative Lipschitz Condition is sufficient for the resulting
barrier function to be self-concordant. More precisely one can state the following.

Lemma 1

If the second derivative $D^2 f$ of a convex function $f$ fulfills the Relative Lipschitz
Condition (1.2) (for infinitesimal $\|h\|$) on the domain $P_f^\circ := \{x \mid f(x) < 0\}$ and if $f$
is three times continuously differentiable on $P_f^\circ$, then the barrier function
$\varphi(x) := -\ln(-f(x))$ is $\alpha$-self-concordant on $P_f^\circ$ with the parameter $\alpha = (1 + M)^2$.

Proof: See Appendix. |

The converse of Lemma 1 is not true; there even exist non-convex functions
$f$ whose barrier functions $\varphi(x) := -\ln(-f(x))$ are $\alpha$-self-concordant (and hence
convex) on $P_f^\circ$ (see Subsection 2.7, “Extensions”). The ideas of self-concordance
and of the Relative Lipschitz condition are, however, closely related, and as the following
two statements show, self-concordance is in fact equivalent to a modified Relative
Lipschitz condition. Lemma 2 is taken from [12].

Lemma  2  (Theorem  1.1  in  [12]) 

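Let $\varphi$ be strongly $\alpha$-self-concordant, $\varphi \in S_\alpha^+(P^\circ)$, and let a strictly feasible point
$y \in P^\circ$ be given, and $h, z \in \mathbb{R}^n$. Define $H(y) := D^2\varphi(y)$, $\delta := \sqrt{h^T H(y) h} = \|h\|_{H(y)}$,
and $x := y + h$.

Then the following is true: If $\delta < 1/\sqrt{\alpha}$, then

$$x = y + h \in P^\circ$$

and

$$(1 - \sqrt{\alpha}\,\delta)\,\|z\|_{H(y)} \le \|z\|_{H(x)} \le \frac{1}{1 - \sqrt{\alpha}\,\delta}\,\|z\|_{H(y)}.$$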

Proof:  See  Appendix.  | 

For $\delta \le$ [...] one has [...] $\le 1 +$ [...] $\delta$, and thus Lemma 2 implies that

$$|z^T (D^2\varphi(y + h) - D^2\varphi(y)) z| \le [\ldots]\,\sqrt{\alpha}\,\|h\|_{H(y)}\, z^T D^2\varphi(y) z$$

(cf. [3] (Lemma 2.1, equivalence of the $H$-norms)). Hence, a self-concordant barrier
function also fulfills a Relative Lipschitz condition, where the norm of the vector
$h$ is measured by $D^2\varphi(x)$ directly (and not by $D^2(-\ln(-\varphi(x)))$). Conversely, it is
easy to show the following.

Remark  3 

Let the function $\varphi$ fulfill a Relative Lipschitz condition of the following form (with
the notation of Lemma 2):

$$|z^T (D^2\varphi(x + h) - D^2\varphi(x)) z| \le \beta\,\|h\|_{H(x)}\, z^T D^2\varphi(x) z.$$

Then $\varphi$ is self-concordant with parameter $\alpha = \beta^2/4$.

Proof:  See  Appendix.  | 
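The value of the parameter in Remark 3 can be made plausible by a short calculation. This is only a sketch under the additional assumption that $\varphi$ is three times continuously differentiable; it is not meant to replace the proof in the Appendix.

```latex
% Replace h by th with t > 0, divide the Relative Lipschitz inequality by t,
% and let t -> 0 (note that \|th\|_{H(x)} = t \|h\|_{H(x)}):
\[
  \bigl| D^3\varphi(x)[h,z,z] \bigr|
  = \lim_{t \to 0} \frac{1}{t}
    \bigl| z^T \bigl( D^2\varphi(x+th) - D^2\varphi(x) \bigr) z \bigr|
  \le \beta \,\|h\|_{H(x)}\, z^T D^2\varphi(x) z .
\]
% Choosing z = h and using \|h\|_{H(x)} = (D^2\varphi(x)[h,h])^{1/2}:
\[
  \bigl| D^3\varphi(x)[h,h,h] \bigr|
  \le \beta \bigl( D^2\varphi(x)[h,h] \bigr)^{3/2}
  = 2\sqrt{\beta^2/4}\,\bigl( D^2\varphi(x)[h,h] \bigr)^{3/2},
\]
% which is inequality (1.1) with \alpha = \beta^2/4.
```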

1.1.4.  Curvature  constraint 

Mehrotra and Sun [9] do not need the continuity of the second derivative of the
functions $f_i$, but only a curvature constraint of the form

$$\exists \kappa \ge 1 : \ \forall h \in \mathbb{R}^n \ \ \forall x, y \in P : \quad 0 < \kappa^{-2}\, h^T D^2 f_i(y) h \le h^T D^2 f_i(x) h \le \kappa^2\, h^T D^2 f_i(y) h.$$

With this condition they can show the same result as Jarre and Nesterov and
Nemirovsky for their algorithms. However, since in the above form the curvature
constraint excludes linear or semidefinite quadratic functions $f_i$, as well as singularities
on the boundary of $P$, we will not use this condition here.

1.2.  Further  Assumptions 

In the following we will assume that the functions $-\ln(-f_i(x))$ are self-concordant
with parameters $\alpha_i$. Note that (by Lemma 1) the logarithmic barrier functions $\varphi_i$
of linear and convex quadratic functions $f_i$ are 1-self-concordant, and so is their
sum $\varphi(x) = \sum_{i=1}^m \varphi_i(x)$ (by Remark 1). Thus, the following analysis includes linearly
or quadratically constrained convex programming as a special case with $\alpha = 1$.

Without loss of generality we further assume that $f_0$ is linear. (Otherwise we may
introduce an additional variable $x_{n+1}$, an additional constraint
$f_{m+1}(x, x_{n+1}) := f_0(x) - x_{n+1} \le 0$, and minimize $x_{n+1}$. Note that for this construction
the new function $-\ln(-f_{m+1})$ must be self-concordant on the domain
$\{(x, x_{n+1}) \mid x \in P^\circ,\ f_0(x) < x_{n+1}\}$. In a practical implementation such a construction
may increase the condition numbers of the Hessians considered in the algorithm.)

Note that by construction the resulting function $\varphi(x) = \sum_{i=1}^m \varphi_i(x)$ is strongly
self-concordant on $P^\circ$.
2. PROPERTIES AND A SIMPLE METHOD


For $\lambda > \lambda^*$, let $P(\lambda)$ denote the feasible set $P$ constrained by the additional
inequality $f_0(x) = c^T x \le \lambda$:

$$P(\lambda) := P \cap \{x \mid f_0(x) \le \lambda\}.$$

The method outlined in this section follows a homotopy path $\lambda : \infty \to \lambda^*$ of some
interior point $x(\lambda)$ in $P(\lambda)$. Here, $x(\lambda)$ is chosen as the well-known analytic center
of $P(\lambda)$ (Sonnevend [14]).


2.1.  The  Analytic  Center 

For each parameter $\infty > \lambda > \lambda^*$ the analytic center $x(\lambda)$ of $P(\lambda)$ is defined as the
unique point $x$ in $P(\lambda)^\circ$ minimizing the strictly convex logarithmic barrier function¹
$\varphi$,

$$\varphi(x, \lambda) := -q \ln(\lambda - f_0(x)) - \sum_{i=1}^m \ln(-f_i(x)), \tag{2.1}$$

with some fixed $q \in \mathbb{N}$ (the positive natural numbers) and $P(\lambda = \infty) := P$. In this
paper only the choice $q = m$ is considered; the modification to other values of $q$ is
straightforward. The analytic center depends smoothly on all constraints, also on $\lambda$,
and as the following analysis shows it can be efficiently approximated by Newton's
method. The strict convexity of $\varphi$ follows immediately from the boundedness of $P$
and the strong self-concordance of $\varphi$ on $P(\lambda)^\circ$.

The analytic center $x(\lambda)$ also maximizes the concave function of $x$

$$\Psi(x, \lambda) := \Bigl[ (\lambda - f_0(x))^q \prod_{i=1}^m (-f_i(x)) \Bigr]^{1/(m+q)} \tag{2.2}$$

over $P$. One may interpret (2.2) as $x(\lambda)$ maximizing the product of the ‘distances’
to the constraints $f_i(x) \le 0$.

Proof of concavity of (2.2): see Appendix. |
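As an illustration of the two characterizations of the analytic center, the following sketch computes the center of a small one-dimensional set by Newton's method and compares the product of the ‘distances’ at nearby points. The instance (three linear constraints, one of them redundant), the damping rule, and all names are hypothetical choices of ours; the objective term of (2.1) is left out, which corresponds to the case $\lambda = \infty$.

```python
import math
from typing import List, Tuple

# Linear constraints a_i * x - b_i <= 0 in one variable (toy instance):
#   -x <= 0,   x - 1 <= 0,   x - 2 <= 0   (the last one is redundant for P = [0, 1])
CONSTRAINTS: List[Tuple[float, float]] = [(-1.0, 0.0), (1.0, 1.0), (1.0, 2.0)]


def slacks(x: float) -> List[float]:
    """The 'distances' -f_i(x) = b_i - a_i * x to the constraints."""
    return [b - a * x for a, b in CONSTRAINTS]


def grad_hess(x: float) -> Tuple[float, float]:
    """First and second derivative of phi(x) = -sum_i ln(-f_i(x))."""
    s = slacks(x)
    g = sum(a / si for (a, _), si in zip(CONSTRAINTS, s))
    hess = sum(a * a / si ** 2 for (a, _), si in zip(CONSTRAINTS, s))
    return g, hess


def analytic_center(x: float, tol: float = 1e-12, max_iter: int = 50) -> float:
    """Newton's method for the minimizer of phi, started at a strictly feasible x."""
    for _ in range(max_iter):
        g, hess = grad_hess(x)
        step = -g / hess
        while min(slacks(x + step)) <= 0.0:  # halve the step if it would leave the interior
            step *= 0.5
        x += step
        if abs(step) * math.sqrt(hess) < tol:  # step length in the H-norm
            break
    return x


def product_of_slacks(x: float) -> float:
    """The quantity maximized by the analytic center (up to the root in (2.2))."""
    return math.prod(slacks(x))


center = analytic_center(0.5)
print(center, product_of_slacks(center))
```

For this instance the stationarity condition reduces to $3x^2 - 6x + 2 = 0$, so the center is $1 - 1/\sqrt{3}$ rather than the midpoint of the interval: the analytic center depends on the description of the set, including redundant constraints.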

The analytic center $x(P)$ of a set $P$ (or of the set $P(\lambda)$) is invariant under affine
transformations of $P$ in the sense that an invertible affine transformation
$A : \mathbb{R}^n \to \mathbb{R}^n$ applied to the set $P$, $P \to AP = \{x \mid f_i(A^{-1}x) \le 0\}$, also maps the analytic center
$x(P) = \arg\max \bigl[\prod_{i=1}^m (-f_i(x))\bigr]^{1/m}$ to
$Ax(P) = \arg\max \bigl[\prod_{i=1}^m (-f_i(A^{-1}x))\bigr]^{1/m} = x(AP)$.
It is also invariant under scaling of the functions $f_i$.

The function $\varphi$ in (2.1) is $\alpha$-self-concordant on $P^\circ(\lambda)$ if the functions $-\ln(-f_i(x))$
are $\alpha_i$-self-concordant, and, according to Remark 1, $\alpha = \max\{\alpha_0/q,\ \alpha_i \ (1 \le i \le m)\}$. Hence,
for linear or quadratic $f_i$ we have $\alpha = 1$ (by Lemma 1).

¹The function $\varphi(x, \lambda)$ in (2.1) defines the analytic center of $P(\lambda)$. For brevity we will also
sometimes deal with the function $\varphi(x) := -\sum_{i=1}^m \ln(-f_i(x))$ defining the analytic center of $P$.
Similarly with $H(x) := D^2\varphi(x)$ and $H(x, \lambda) := D^2\varphi(x, \lambda)$. Results for $\varphi(x)$ and $H(x)$ are
applied later to $\varphi(x, \lambda)$ and $H(x, \lambda)$.
2.2. Ellipsoidal Approximations of P

If the feasible set $P$ is bounded, then the semi-norm in Lemma 2 is in fact a norm
that is closely related to the geometrical shape of $P$. Lemma 2 already stated an
inner ellipsoidal approximation of $P$: for any point $x \in P^\circ$ the point $x + h \in P^\circ$ if
$h$ belongs to the ellipsoid defined by

$$\|h\|_{H(x)} < 1/\sqrt{\alpha},$$

where $H(x) = D^2\varphi(x)$. Furthermore one can show the following outer ellipsoidal
approximation of $P$ centered at its analytic center.

Lemma 3 (cf. [5] Corollary 2.15)

Let $x$ be the analytic center of $P$ and $h \in \mathbb{R}^n$ be arbitrary with

$$\|h\|_{H(x)} > 16\sqrt{\alpha}\, m.$$

Then $x + h \notin P$.

Proof: See Appendix. |

This two-sided ellipsoidal approximation of the feasible set $P$ around its analytic
center has been shown in [14] for the linear case and in [15, 3] for quadratic $f_i$ (see
also [5]). It relates the matrix $H$ to the shape of the set $P$. In the next subsection we
will show that the underlying norm $\|\cdot\|_H$ is also suitable when analyzing Newton's
method.
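The two ellipsoidal approximations can be checked on the simplest possible set. The sketch below uses the interval $P = [0, 1]$ described by the $m = 2$ linear constraints $-x \le 0$ and $x - 1 \le 0$, so that $\alpha = 1$ and the analytic center is $0.5$; the instance and the helper names are assumptions made for illustration only.

```python
import math

ALPHA = 1.0        # self-concordance parameter for linear constraints
M_CONSTRAINTS = 2  # number of constraints describing P = [0, 1]
CENTER = 0.5       # analytic center of [0, 1] for this description


def hessian(x: float) -> float:
    """H(x) = D^2 phi(x) for phi(x) = -ln(x) - ln(1 - x)."""
    return 1.0 / x ** 2 + 1.0 / (1.0 - x) ** 2


def inner_ellipsoid_ok(x: float, samples: int = 100) -> bool:
    """Every sampled h with ||h||_{H(x)} < 1/sqrt(alpha) keeps x + h strictly feasible."""
    scale = (1.0 / math.sqrt(ALPHA)) / math.sqrt(hessian(x))  # |h| on the ellipsoid boundary
    steps = (s * scale * k / (samples + 1) for k in range(samples + 1) for s in (-1.0, 1.0))
    return all(0.0 < x + h < 1.0 for h in steps)


def outer_ellipsoid_ok(samples: int = 100) -> bool:
    """Every sampled h with ||h||_{H(center)} > 16*sqrt(alpha)*m leaves P."""
    radius = 16.0 * math.sqrt(ALPHA) * M_CONSTRAINTS
    scale = radius / math.sqrt(hessian(CENTER))
    steps = (s * scale * (1.0 + k / 10.0) for k in range(1, samples + 1) for s in (-1.0, 1.0))
    return all(not (0.0 <= CENTER + h <= 1.0) for h in steps)


print(inner_ellipsoid_ok(0.1), inner_ellipsoid_ok(CENTER), outer_ellipsoid_ok())
```

In this one-dimensional case the outer bound is far from tight, since already $|h| > 0.5$ leaves the interval; the point of Lemma 3 is that the ratio between the outer and inner radius depends only on $m$ and $\alpha$.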

2.3.  Newton’s  Method 

In the following we will give a proof of quadratic convergence of Newton's method for
approximating the analytic center $x$ of a set $P$ and give explicit constants (depending
only on $\alpha$) that describe the speed of convergence. Here all “distances” are measured
in the $H$-norm and related to the concordance parameter $\alpha$. Lemma 4 has been
proved in modified form in [12] and states that if a Newton step for finding the
center is small, then Newton's method converges. Conversely, Remark 4 guarantees
that if a point $y$ is sufficiently “close” to the analytic center $x$ of $P$, then again
Newton's method converges. Recalling some notation, the Newton step $h(y)$ starting
at $y$ for finding the analytic center $x$ is given by $h(y) = -H^{-1}(y) D\varphi(y)^T$, with
$H(y) = D^2\varphi(y)$.

Lemma 4 ([12], Theorem 1.3; quadratic convergence with constant [...])

Let $\varphi$ be a strongly $\alpha$-self-concordant function defined on a nonempty bounded set
$P^\circ$. For a point $y \in P^\circ$ define $H(y) := D^2\varphi(y)$, $g(y) := D\varphi(y)^T$. Let
$h := h(y) = -H(y)^{-1} g(y)$ be the Newton step starting at $y$ for finding the analytic center $x$ of
$P$, let $\bar h$ be the following Newton step starting at $y + h$, and define the lengths of
the Newton steps by $\delta := \|h\|_{H(y)}$ and $\bar\delta := \|\bar h\|_{H(y+h)}$. If [...]
then $y + h$ is feasible, $y + h \in P^\circ$, and the length $\bar\delta$ of the following Newton step is
of order $\delta^2$:

$$\bar\delta \le [\ldots]\,\delta^2.$$

For $\delta <$ [...] this implies convergence of Newton's method, and for $\delta \le 1/(4\sqrt{\alpha})$ it
implies that $\bar\delta \le [\ldots]\,\sqrt{\alpha}\,\delta^2$.

Proof:  See  Appendix.  | 

The importance of this lemma is that the constant in front of $\delta^2$ (a multiple of $\sqrt{\alpha}$)
can be explicitly stated, depending only on $\alpha$ and not on the data of the functions $f_i$.

For suitably damped Newton steps it is also shown in [12] that a fixed rate of
convergence holds for the case $\delta > 1/(4\sqrt{\alpha})$.

In the following subsections the result above will be applied to the function
$\varphi(x, \lambda)$ (and $h(x, \lambda)$, $H(x, \lambda)$) to analyze a “short-step” method for following the
path of centers. A “long-step” method for convex constraint functions whose
Hessians fulfill the Relative Lipschitz condition is analyzed in [2].
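The quadratic decrease of the Newton step lengths can be observed directly. The sketch below records the lengths $\delta_k$, measured in the $H$-norm, for a hypothetical one-dimensional barrier with $\alpha = 1$. For comparison it uses the bound $\delta_{k+1} \le (\delta_k/(1 - \delta_k))^2$, which is the standard estimate for the case $\alpha = 1$ in the self-concordance literature; we do not claim that this is the constant stated in Lemma 4.

```python
import math
from typing import List, Tuple


def grad_hess(x: float) -> Tuple[float, float]:
    """Derivatives of phi(x) = -ln(x) - ln(1 - x) - ln(2 - x) on (0, 1)."""
    g = -1.0 / x + 1.0 / (1.0 - x) + 1.0 / (2.0 - x)
    hess = 1.0 / x ** 2 + 1.0 / (1.0 - x) ** 2 + 1.0 / (2.0 - x) ** 2
    return g, hess


def newton_step_lengths(x: float, steps: int) -> List[float]:
    """H-norm lengths of the Newton steps at x and at the next `steps` iterates."""
    lengths = []
    for _ in range(steps + 1):
        g, hess = grad_hess(x)
        lengths.append(abs(g) / math.sqrt(hess))  # ||h||_H = |g| / sqrt(H) in one dimension
        x -= g / hess                              # full (undamped) Newton step
    return lengths


deltas = newton_step_lengths(0.5, 3)
for current, following in zip(deltas, deltas[1:]):
    print(current, following, (current / (1.0 - current)) ** 2)
```

Only three steps are taken, because after that the step lengths reach the level of rounding errors and the comparison is no longer meaningful in floating-point arithmetic.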


Remark  4 

In the notation of the previous lemma, the following statement holds: If the length
$\delta = \|h\|_{H(y)}$ of the Newton step $h$ starting at $y$ for finding the analytic center $x$
of $P$ fulfills $\delta \le$ [...], then the “distance” from $y + h$ to $x$ is of the order $\delta^2$; more
precisely, $\|y + h - x\|_H \le [\ldots]\,\sqrt{\alpha}\,\delta^2$.

Proof:  See  Appendix.  | 

These properties show that the length of the Newton step (in the $H$-norm) is a
measure for the closeness to the center that can be used to analyze a method. In
this context let us state two further remarks that are not needed for the analysis
here but may be interesting for step-length control in a numerical implementation.

Remark 5

The $H$-norm of the Newton step $h(y)$ starting at a point $y \in P^\circ$ for finding the
center $x$ of $P$ is uniformly bounded for any $y \in P^\circ$ by $\|h(y)\|_{H(y)} \le \sqrt{m}$. This
bound does not depend on $\alpha$. A similar observation is made in Proposition 3.5 in
[12]. There are examples where $\|h(y)\|_{H(y)} \in O(\sqrt{m})$; see e.g. [13].

Proof:  See  Appendix.  | 

Remark  6 

Let some $\varepsilon$ with $0 < \varepsilon \le$ [...] be given. If a point $y$ satisfies $\|y - x\|_{H(x)} \le$ [...], then
$y$ lies in the domain of quadratic convergence of Newton's method and the Newton
successor $y' = y - H(y)^{-1} g(y)$ satisfies [...].

This is particularly interesting, since by Lemma 3 the whole set $P$ is contained in a
fixed “multiple” of this domain, namely $P \subset \{y \mid \|y - x\|_{H(x)} \le 16 m \sqrt{\alpha}\}$.

Proof:  See  Appendix.  | 
2.4. Short-Step Algorithm

Below, a short-step algorithm is stated. This algorithm is too slow for a practical
implementation. Its rate of convergence, however, ensures polynomiality in the case
of a linear program (since the exact solution can be rounded from a sufficiently
accurate approximation [1]). Further, the same rate of convergence can be
guaranteed for a convex program with constraints whose logarithmic barrier functions
are strongly self-concordant. Possible acceleration techniques for the algorithm that
are based on the theoretical results developed here are outlined in Subsection 2.7.
Implementations are discussed e.g. in [6, 8, 10, 2].

Under the assumptions of Sections 1 and 1.2 let a point $y_0 \in P^\circ$ and some
number $\lambda_0 > \lambda^*$ be given such that the first Newton step satisfies
$\|h(y_0, \lambda_0)\|_{H(y_0, \lambda_0)} \le 1/(4\sqrt{\alpha})$.
Simple modifications of the algorithm to generate such a point $y_0$ and $\lambda_0$ are omitted
here (see e.g. [3]). Again, the objective function $f_0(x)$ is denoted by $c^T x$.


Algorithm 

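1. $k := 0$; $\sigma := 1/(8\sqrt{m}\sqrt{\alpha})$; $\varepsilon$ = desired accuracy.

2. $y_{k+1} := y_k + h_k$, where $h_k := h(y_k, \lambda_k) = -D^2\varphi(y_k, \lambda_k)^{-1} D\varphi(y_k, \lambda_k)^T$.

3. If $\lambda_k - c^T y_{k+1} < \varepsilon$, stop.

4. $\lambda_{k+1} := \lambda_k - \sigma(\lambda_k - c^T y_{k+1})$.

5. $k := k + 1$; go to 2.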
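The steps of the algorithm can be sketched in a few lines for a toy problem. Everything specific in the following code is an assumption of ours: the one-variable linear program (minimize $x$ subject to $0 \le x \le 1$, so $\lambda^* = 0$, $m = 2$, $\alpha = 1$), the starting values, and in particular the step-size parameter $\sigma = 1/(8\sqrt{m}\sqrt{\alpha})$, which is our reading of a value that is hard to make out in the source.

```python
import math
from typing import List, Tuple

C = 1.0      # objective f_0(x) = C * x
M = 2        # number of constraints: -x <= 0 and x - 1 <= 0
Q = M        # weight q of the objective term, q = m as in the text
ALPHA = 1.0  # self-concordance parameter for linear constraints


def newton_step(y: float, lam: float) -> Tuple[float, float]:
    """Newton step h and its H-norm for phi(x, lam) = -q ln(lam - c x) - ln(x) - ln(1 - x)."""
    gap = lam - C * y
    g = Q * C / gap - 1.0 / y + 1.0 / (1.0 - y)
    hess = Q * C * C / gap ** 2 + 1.0 / y ** 2 + 1.0 / (1.0 - y) ** 2
    h = -g / hess
    return h, abs(h) * math.sqrt(hess)


def short_step(y: float, lam: float, eps: float,
               max_iter: int = 100000) -> Tuple[float, float, List[float]]:
    """Short-step method of centers; returns the last iterate, lambda and all step lengths."""
    sigma = 1.0 / (8.0 * math.sqrt(M) * math.sqrt(ALPHA))  # assumed step-size parameter
    lengths: List[float] = []
    for _ in range(max_iter):
        h, delta = newton_step(y, lam)
        lengths.append(delta)
        y += h                          # step 2: one Newton step towards x(lambda_k)
        if lam - C * y < eps:           # step 3: stopping criterion
            break
        lam -= sigma * (lam - C * y)    # step 4: reduce the upper bound lambda_k
    return y, lam, lengths


y_final, lam_final, step_lengths = short_step(0.36, 2.0, 1e-6)
print(y_final, lam_final, len(step_lengths), max(step_lengths))
```

The starting point $0.36$ is close to the analytic center of $P(2)$, so the first Newton step is short. The recorded step lengths give a simple way to monitor whether the iterates stay in the region required by the convergence analysis.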

2.5.  Convergence  Analysis 

In order to ensure convergence² of the algorithm the following two properties are
shown.

First, after the update of $\lambda_{k+1}$ in step 4, the iterate $y_{k+1}$ again satisfies

$$\|h(y_{k+1}, \lambda_{k+1})\|_{H(y_{k+1}, \lambda_{k+1})} \le 1/(4\sqrt{\alpha}). \tag{2.3}$$

This guarantees that the iterates remain feasible and close to the center.

Second,

$$\lambda_k - c^T y_{k+1} \ge \tfrac{2}{3}\,(c^T y_{k+1} - \lambda^*), \tag{2.4}$$

so that the stopping criterion in step 3 is exact, and the gap $\lambda_k - \lambda^*$ between
the upper bound $\lambda_k$ for $c^T y_k$ and the (unknown) optimal value $\lambda^*$ is reduced by a
factor of at least $0.4\sigma$ in step 4, i.e. $\lambda_{k+1} - \lambda^* \le (1 - 0.4\sigma)(\lambda_k - \lambda^*)$.

Proof:  See  Appendix.  | 
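The link between (2.4) and the reduction of the gap in step 4 can be spelled out in two lines. The calculation below takes the constant in (2.4) to be $2/3$ and reads the step parameter as $\sigma$; both are assumptions on our part, chosen because they reproduce the stated factor $0.4\sigma$.

```latex
% Split the gap and bound the second part by (2.4) with the assumed constant 2/3:
\[
  \lambda_k - \lambda^*
  = (\lambda_k - c^T y_{k+1}) + (c^T y_{k+1} - \lambda^*)
  \le \bigl(1 + \tfrac{3}{2}\bigr)(\lambda_k - c^T y_{k+1}),
  \qquad\text{i.e.}\qquad
  \lambda_k - c^T y_{k+1} \ge 0.4\,(\lambda_k - \lambda^*).
\]
% Insert this into the update of step 4:
\[
  \lambda_{k+1} - \lambda^*
  = (\lambda_k - \lambda^*) - \sigma(\lambda_k - c^T y_{k+1})
  \le (1 - 0.4\,\sigma)(\lambda_k - \lambda^*).
\]
```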

This  completes  the  proof  of  convergence! 

²Only feasibility and convergence of the objective function value $c^T y_k$ to the optimal value are
ensured.


2.6.  Bounded  Condition  Numbers 

Implementations of the affine scaling algorithm for solving linear programs encounter
nearly singular Hessians if large step-lengths are chosen such that the iterates lie
very near to the boundary of $P$. The following lemma shows that for nondegenerate
linear programs this difficulty can be partly eliminated if the iterates remain in a
neighborhood of the path of centers. Implementations in [6, 10, 8] show that with
extrapolation techniques it is possible to generate fast algorithms that remain in
such a neighborhood.

Lemma 5 (Estimate of worst-case condition numbers for the matrices $H(y)$ for
nondegenerate linear problems)

Consider a (primal) nondegenerate linear program and any algorithm generating a
sequence of points $y_k$ in an $a$-neighborhood of the path of centers with $a =$ [...].

Here a point $y$ is in the $a$-neighborhood of the path of centers if the Newton step $h$
starting at $y$ for finding the “nearest” center, measured in the $H$-norm, is less than
$a$, i.e.,

$$\lambda := \arg\min \|D\varphi(y, \lambda)\|_{H(y, \lambda)^{-1}}$$

defines the “nearest” center $x(\lambda)$ to $y$, and the corresponding Newton step

$$h := h(y, \lambda) = -D^2\varphi(y, \lambda)^{-1} D\varphi(y, \lambda)^T$$

satisfies $\|h\|_{H(y, \lambda)} \le a$. Then there exists an $\varepsilon > 0$ depending on the geometry of the
problem such that the condition numbers of the Hessians are uniformly bounded:
$\mathrm{cond}_2(H(y_k)) \le 1/\varepsilon$ for all $k \ge 0$.

To prove this lemma we first define a condition number for a bounded convex
set $M$ (the “flatness” of the set $M$). Using the two-sided ellipsoidal approximation
of the set $P(\lambda)$ (by the matrix $H(x(\lambda))$) we then obtain a bound on the condition
number of $H(x)$ for $x$ near the path of centers. The proof is given in the Appendix. |

Note: The bound $1/\varepsilon$ on the condition numbers in the preceding Lemma
depends on the geometry of the problem (on the “flatness” of the sets $P(\lambda)$), and
unfortunately, as simple examples show, the magnitude of the bound of the
condition numbers may be as bad as order $2^L$, where $L$ is the length of the input of the
problem.

There are nondegenerate examples with nonlinear (e.g. convex quadratic)
constraint functions for which no such bound exists.

2.7.  Extensions 

Nesterov and Nemirovsky [12] present an extension of the method presented above that
handles certain non-convex functions $f_i$ whose level sets $f_i(x) \le 0$ describe convex
domains; for example, the function $f : \mathbb{R}^{n+1} \to \mathbb{R}$ defined by $f(x, t) := \|x\|_2^2 - t^2$
for $x \in \mathbb{R}^n$, $t \in \mathbb{R}$ and $t \ge \|x\|_2$.

They consider the case that the functions $f_i$ are not necessarily convex, but
their barrier functions $-\ln(-f_i(x))$ are self-concordant (and hence convex) with
the additional property that there exists a $\vartheta < \infty$ such that the Newton step $h$
starting at a point $y \in P^\circ$ for finding the center $x$ of $P$ has a length $\|h\|_{H(y)} \le \vartheta$
uniformly bounded for all $y \in P^\circ$. Some of the results presented above (like the
outer ellipsoidal approximation of $P$ or equation (2.4)) no longer hold, but the
convergence of Newton's method for finding the center is the same and convergence
of a modified barrier method can be maintained as well.

Another modification of the method is an acceleration when following the smooth
path of analytic centers $x(\lambda)$ for $\lambda > \lambda^*$. The tangent to this path can be computed
(as $H(y)^{-1} c$) and used as a predictor for a next iterate down the path of centers,
while only one or two steps of Newton's method (with line search) will serve as a
corrector. Finding the right compromise of staying close enough to the central curve
on the one hand and taking large steps along the tangent on the other hand, along
with efficient factorizations (or preconditioners) of the matrices $H(y)$, is crucial for
a practical program. Implementations of such predictor-corrector type approaches
are promising; see e.g. [6, 8, 7, 10].


2.8.  Concluding  Remarks 

There  are  some  difficulties  when  trying  to  deduce  statements  about  polynomiality 
from  the  above  method. 


2.8.1.  Irrational  solutions 

The P/NP model for classifying the “difficulty” of classes of problems is
unsatisfactory if one considers interior-point methods that give exactly the same (theoretical)
rate of convergence for linear and quadratically constrained convex problems. For
linear problems this rate of convergence implies polynomiality of the class of
linear programming problems, since one can round the exact (rational) solution from
a sufficiently accurate approximation in polynomial time. For the class of convex
quadratic problems, no statement about polynomiality can be deduced from this
convergence (since a quadratic problem may have an irrational optimal solution
that to date cannot be computed by rounding techniques). It is therefore appropriate
in a more general context to call a class K of problems generalized polynomial
if one is able to compute the exact solution of any problem in
K up to d digits accuracy in a time that is bounded by a polynomial in d multiplied
by a polynomial in the length of the data of the problem.

The definition of generalized polynomiality extends the notion of polynomial-time
algorithms in a natural way to problems that do not necessarily have a rational
solution. So far such problems have escaped any classification, since the exact
solution often could not be computed at all, even if there was a good algorithm to
approximate it.

Clearly any problem that is polynomial is also generalized polynomial, and vice
versa: a generalized polynomial problem that has a unique rational solution whose
length is bounded by a polynomial in the length of the data is also polynomial in
the classical sense.
2.8.2. Non-algebraic functions

Concerning the class of $\alpha$-self-concordant problems, one further difficulty in
extending the model of polynomiality is that the “length” of the input cannot be measured
in a natural way if the input includes non-algebraic functions.
3. APPENDIX


The Appendix is divided into two subsections. In Subsection 3.1 we state some
useful and general results. In Subsection 3.2 we present the proofs referred to
in Sections 1 and 2.

3.1. Some Useful Lemmas

We begin by recalling a slightly generalized version of the well-known
Cauchy-Schwarz inequality.

Generalized  Cauchy-Schwarz  inequality: 

If  A,  A/  are  symmetric  matrices  with  \xTMx\  <  xT Ax  Va:  G  Wn,  then  also 

{aTMbf  <  aTAa  bT Ab  Va,6  G  Rn.  (3.1) 

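Proof: Without loss of generality assume that $A$ is positive definite. (Else $A_\epsilon := A + \epsilon I$ is positive definite for all $\epsilon > 0$; take the limit as $\epsilon \to 0$ for fixed $a, b$.) Assume further that $a, b \ne 0$ and set $\mu := (a^T A a / b^T A b)^{1/4}$; then it follows from

$$a^T M b = \tfrac14\big((a+b)^T M (a+b) - (a-b)^T M (a-b)\big)$$

that

$$\begin{aligned}
(a^T M b)^2 &= \tfrac{1}{16}\big((a+b)^T M (a+b) - (a-b)^T M (a-b)\big)^2 \\
&\le \tfrac{1}{16}\big((a+b)^T A (a+b) + (a-b)^T A (a-b)\big)^2 \\
&= \tfrac{1}{16}\big(2a^T A a + 2b^T A b\big)^2 = \tfrac14\big(a^T A a + b^T A b\big)^2 .
\end{aligned}$$

When replacing $a$ by $a/\mu$ and $b$ by $\mu b$ this implies

$$(a^T M b)^2 = \Big(\big(\tfrac{a}{\mu}\big)^T M (\mu b)\Big)^2 \le \tfrac14\Big(\tfrac{1}{\mu^2} a^T A a + \mu^2 b^T A b\Big)^2 = (a^T A a)(b^T A b). \qquad ■$$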
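As a numerical illustration of (3.1), the following minimal sketch samples random vectors for one pair of 2x2 matrices. The matrices `A` and `M` are hypothetical examples (not taken from the text), chosen so that $A - M$ and $A + M$ are both positive definite, which gives $|x^T M x| \le x^T A x$.

```python
import random

def quad(M, a, b):
    """Bilinear form a^T M b for a 2x2 matrix given as nested lists."""
    return sum(a[i] * M[i][j] * b[j] for i in range(2) for j in range(2))

# Hypothetical data: A - M and A + M are positive definite,
# hence |x^T M x| <= x^T A x for every x.
A = [[2.0, 0.0], [0.0, 3.0]]
M = [[1.0, 0.5], [0.5, -1.0]]

def check_31(trials=1000, seed=0):
    """Return True if (a^T M b)^2 <= (a^T A a)(b^T A b) on all random samples."""
    rng = random.Random(seed)
    for _ in range(trials):
        a = [rng.uniform(-5, 5), rng.uniform(-5, 5)]
        b = [rng.uniform(-5, 5), rng.uniform(-5, 5)]
        lhs = quad(M, a, b) ** 2
        rhs = quad(A, a, a) * quad(A, b, b)
        if lhs > rhs * (1 + 1e-12) + 1e-12:
            return False
    return True
```

A sampling check like this cannot replace the proof; it only guards against misreading the statement.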

The following estimate about the spectral radius for symmetric trilinear forms was observed (without proof) in [12].

Spectral radius for symmetric trilinear forms:

If $M \in \mathbb{R}^{n \times n \times n}$ represents a symmetric trilinear form $M : \mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ and $A \in \mathbb{R}^{n \times n}$ a symmetric bilinear form, and $\mu > 0$ is a scalar such that

$$M[h,h,h]^2 \le \mu A[h,h]^3 \quad \forall h \in \mathbb{R}^n,$$

then also

$$M[x,y,z]^2 \le \mu A[x,x] A[y,y] A[z,z] \quad \forall x, y, z \in \mathbb{R}^n. \tag{3.2}$$

Proof: For $x \in \mathbb{R}^n$ denote by $M_x$ the (symmetric) matrix defined by $y^T M_x z := M_x[y,z] := M[x,y,z]$ for all $y, z \in \mathbb{R}^n$. Without loss of generality let $\mu = 1$ (else substitute $A$ by $\sqrt[3]{\mu} A$). As in the proof of (3.1) assume again that $A$ is positive definite.




By substituting $\tilde M[x,y,z] := M[A^{-1/2}x, A^{-1/2}y, A^{-1/2}z]$ one can further assume that $A = I$ is the identity. Finally, it is sufficient to show that

$$|M[x,h,h]| \le \|x\|_2 \|h\|_2^2 \quad \forall x, h \in \mathbb{R}^n$$

holds, provided that $M[h,h,h]^2 \le \|h\|_2^6$ for all $h \in \mathbb{R}^n$ is true. (The remaining part follows by applying the generalized Cauchy-Schwarz inequality (3.1) for fixed $x$ to $M_x$.) Let

$$\rho := \max\{ M[x,h,h] \mid \|x\|_2 = \|h\|_2 = 1 \}$$

and let $x, h$ be the (not necessarily unique) corresponding arguments. The necessary conditions for a maximum (or a minimum if $M[x,h,h]$ is negative) imply that

$$\begin{pmatrix} M_h h \\ 2 M_x h \end{pmatrix} = \beta \begin{pmatrix} 2x \\ 0 \end{pmatrix} + \mu \begin{pmatrix} 0 \\ 2h \end{pmatrix},$$

where $\beta$ and $\mu$ are the Lagrange multipliers. From this we deduce that $\beta = \rho/2$ and $\mu = \rho$ (by multiplying from left with $(x^T, h^T)$) and therefore, since $M_x h = M_h x$,

$$M_h(x + h) = \rho (x + h),$$

which also shows that $M\big[h, \tfrac{x+h}{\|x+h\|_2}, \tfrac{x+h}{\|x+h\|_2}\big] = \rho$. Starting from a maximizing triple $(x, h, h)$ this gives a way of generating an(other) maximizing triple $\big(h, \tfrac{x+h}{\|x+h\|_2}, \tfrac{x+h}{\|x+h\|_2}\big)$.

Iterating this generating process, one obtains a sequence of maximizing triples that converge³ to a triple $(\gamma(x + \beta h), \gamma(x + \beta h), \gamma(x + \beta h))$. By continuity of $M$ this triple is also maximizing. By assumption however, $M[\gamma(x + \beta h), \gamma(x + \beta h), \gamma(x + \beta h)]^2 \le \|\gamma(x + \beta h)\|_2^6$, which finishes the proof. ■
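To make (3.2) concrete, here is a small sketch for one specific form. The choice $M[x,y,z] = \sum_i x_i y_i z_i$ with $A = I$ and $\mu = 1$ is a hypothetical example, not one from the text; it satisfies the hypothesis because $\|h\|_3 \le \|h\|_2$.

```python
import random

def M3(x, y, z):
    """Symmetric trilinear form M[x, y, z] = sum_i x_i * y_i * z_i (example only)."""
    return sum(xi * yi * zi for xi, yi, zi in zip(x, y, z))

def sq(x):
    """A[x, x] for A = I, i.e. the squared Euclidean norm."""
    return sum(xi * xi for xi in x)

def check_32(n=4, trials=1000, seed=1):
    """Sample the hypothesis and the conclusion of (3.2) with mu = 1."""
    rng = random.Random(seed)

    def vec():
        return [rng.uniform(-3, 3) for _ in range(n)]

    for _ in range(trials):
        h, x, y, z = vec(), vec(), vec(), vec()
        # hypothesis: M[h,h,h]^2 <= A[h,h]^3
        if M3(h, h, h) ** 2 > sq(h) ** 3 * (1 + 1e-12):
            return False
        # conclusion (3.2): M[x,y,z]^2 <= A[x,x] A[y,y] A[z,z]
        if M3(x, y, z) ** 2 > sq(x) * sq(y) * sq(z) * (1 + 1e-12):
            return False
    return True
```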

In the following a quantitative result about the relationship of the condition of the Hessian matrix of $\varphi$ and the shape of the sets $P(\lambda)$ is stated. For this purpose it is useful to define a condition number for the sets $P(\lambda)$.

Definition

Let $M$ be a bounded convex set in $\mathbb{R}^n$ that contains at least two points, and let $\bar M$ be its closure. The function $l : \mathbb{R}^n \to \mathbb{R}$ defined by $l(y) := \max\{ y^T(a - b)/\|y\|_2 \mid a, b \in \bar M \}$ measures the length of $M$ in direction $y$. The condition number $\mathrm{cond}_2(M) \in [1, \infty]$ is then defined by

$$r_{\max} := \max\{ l(y) \mid \|y\|_2 = 1 \}, \quad r_{\min} := \min\{ l(y) \mid \|y\|_2 = 1 \}, \quad \mathrm{cond}_2(M) := \frac{r_{\max}}{r_{\min}}$$

and is a measure for the "flatness" of $M$. If $M$ has an interior point, then its condition is finite.

³ The establishment of convergence is straightforward. Assume $x^T h = \theta > 0$ (else replace $h$ by $-h$). Define $x^{(0)} = x$, $x^{(1)} = h$ and $x^{(k+1)} := (x^{(k)} + x^{(k-1)})/\|x^{(k)} + x^{(k-1)}\|_2$. We show that $x^{(k)}$ converges to $\gamma(x + \beta h)$. Writing $x^{(k)}$ as $\gamma_k(x + \beta_k h)$ it follows by induction that $\beta_k > 0$ and $\gamma_k \in [\tfrac12, 1]$ (since $\theta > 0$!). Expressing $\beta_{k+1}$ in terms of $\beta_k$ and $\beta_{k-1}$ shows that $\beta_k$ is a linearly converging sequence. Hence, $\gamma_k$ converges also. ■
(Note that $r_{\max} = \max\{ \|a - b\|_2 \mid a, b \in \bar M \}$ and $r_{\min} = \min\{ \|y\|_2 \mid a + y \notin M^\circ \ \forall a \in M \}$.)

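Proposition

Let $M \subset \mathbb{R}^n$ be convex, $x \in M$, $H \in \mathbb{R}^{n \times n}$ be positive definite and $\gamma \in \mathbb{R}$ be such that

$$x + h \in M \ \text{whenever}\ \|h\|_H \le 1 \quad\text{and}\quad x + h \notin M \ \text{whenever}\ \|h\|_H \ge \gamma.$$

Then

$$\frac{1}{\gamma}\,\mathrm{cond}_2(M) \le \sqrt{\mathrm{cond}_2(H)} \le \gamma\,\mathrm{cond}_2(M).$$

Proof: Denote by $\nu_1 \le \nu_2 \le \dots \le \nu_n$ the eigenvalues of $H$. For positive definite $H$ the condition $\mathrm{cond}_2(H)$ is given by $\nu_n/\nu_1$. Using the definition of $r_{\max}$ and $r_{\min}$ and the ellipsoidal approximation of $M$ it is straightforward to show that

$$\frac{2}{\sqrt{\nu_n}} \le r_{\min} \le \frac{2\gamma}{\sqrt{\nu_n}} \quad\text{and}\quad \frac{2}{\sqrt{\nu_1}} \le r_{\max} \le \frac{2\gamma}{\sqrt{\nu_1}}.$$

From this the claim follows. ■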
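The Proposition can be checked on a small example. The sketch below assumes a hypothetical set, the box $M = [-1,1] \times [-2,2]$ with $x = 0$, $H = \mathrm{diag}(1, 1/4)$ and $\gamma = 1.5$: the ellipse $\|h\|_H \le 1$ lies inside the box, and every $h$ with $\|h\|_H \ge 1.5$ lies outside it. The width $l(y)$ of the box in a unit direction $y$ is $2|y_1| + 4|y_2|$.

```python
import math

def box_width(angle):
    """Length l(y) of the box [-1, 1] x [-2, 2] in the unit direction (cos, sin)."""
    return 2.0 * abs(math.cos(angle)) + 4.0 * abs(math.sin(angle))

def cond_box(samples=20000):
    """cond_2(M) = r_max / r_min, approximated by sampling directions in [0, pi)."""
    widths = [box_width(math.pi * k / samples) for k in range(samples)]
    return max(widths) / min(widths)

def check_proposition(gamma=1.5):
    """Check (1/gamma) cond_2(M) <= sqrt(cond_2(H)) <= gamma cond_2(M)."""
    cond_H = 4.0  # H = diag(1, 1/4) has eigenvalues 1/4 and 1
    c = cond_box()
    return c / gamma <= math.sqrt(cond_H) <= gamma * c
```

For this box the sampled value of `cond_box()` is close to $\sqrt{5}$, the ratio of the diagonal length to the shorter side length.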


3.2.  Proofs  from  Chapters  1  and  2 
3.2.1. Proof of Remark 2

Let the function $f$ fulfill the Relative Lipschitz Condition (1.2) in $y$ and let $f(y) < 0$. Considering the Lagrange remainder formula for the function $g : \mathbb{R} \to \mathbb{R}$, $g(\theta) := f(y + \theta h)$, we obtain for $\|h\|_{H(y)} \le 0.5/(1 + M^{1/3})$ that

$$f(y + h) = f(y) + Df(y)h + \tfrac12 h^T D^2 f(y + \mu h) h$$

with $\mu \in (0,1)$.

Suppose now that $f(y + h) \ge 0$, then we have

$$\begin{aligned}
0 < -f(y) &\le f(y+h) - f(y) = Df(y)h + \tfrac12 h^T D^2 f(y + \mu h) h \\
&\le Df(y)h + \tfrac12 h^T D^2 f(y) h \,\big(1 + M\mu\|h\|_{H(y)}\big) \quad \text{(Rel. Lips. Cond.)} \\
&\le Df(y)h + \tfrac12 h^T D^2 f(y) h \,\Big(1 + \frac{M}{2(1 + M^{1/3})}\Big).
\end{aligned}$$

From

$$\Big(\frac{Df(y)h}{-f(y)}\Big)^2 + \frac{h^T D^2 f(y) h}{-f(y)} = \|h\|^2_{H(y)} \le \frac{1}{4(1 + M^{1/3})^2}$$

follows further that

$$Df(y)h \le \frac{-f(y)}{2(1 + M^{1/3})}$$

and

$$h^T D^2 f(y) h \le \frac{-f(y)}{4(1 + M^{1/3})^2}.$$

Substituting this into the first inequality we obtain

$$\begin{aligned}
0 < -f(y) &\le \frac{-f(y)}{2(1 + M^{1/3})} + \frac12 \cdot \frac{-f(y)}{4(1 + M^{1/3})^2}\Big(1 + \frac{M}{2(1 + M^{1/3})}\Big) \\
&= -f(y)\Big(\frac{1}{2(1 + M^{1/3})} + \frac{1}{8(1 + M^{1/3})^2} + \frac{M}{16(1 + M^{1/3})^3}\Big) < -f(y),
\end{aligned}$$

which is a contradiction.

So $f(y + h)$ must be negative as well. ■


3.2.2.  Proof  of  Lemma  1 


Suppose a function $f$ is three times continuously differentiable in a point $y \in P^\circ$ and fulfills the Relative Lipschitz Condition (1.2) in $y$. We verify the self-concordance for an arbitrary fixed direction $h \in \mathbb{R}^n$. Using (1.2) one can bound the function $g : \mathbb{R} \to \mathbb{R}$,

$$g(\theta) := D^2 f(y)[h,h] - D^2 f(y + \theta h)[h,h]$$

by

$$|g(\theta)| \le M \|\theta h\|_{H(y)} D^2 f(y)[h,h]$$

for sufficiently small $\theta h$. The definition of $\|\cdot\|_{H(y)}$ allows us to continue

$$|g(\theta)| \le M |\theta| \left( \frac{D^2 f(y)[h,h]^{1/2}}{(-f(y))^{1/2}} + \frac{|Df(y)[h]|}{-f(y)} \right) D^2 f(y)[h,h]. \tag{3.3}$$

(Using that $\sqrt a + \sqrt b \ge \sqrt{a + b}$ for $a, b \ge 0$.) Since $g'(\theta) = -D^3 f(y + \theta h)[h,h,h]$ it follows that

$$g(\theta) = g(0) + \theta g'(\mu\theta) = -\theta D^3 f(y + \mu\theta h)[h,h,h]$$

for some $\mu \in (0,1)$. Hence

$$|g(\theta)| = |\theta D^3 f(y + \mu\theta h)[h,h,h]|.$$

Substituting this into inequality (3.3) yields

$$|D^3 f(y + \mu\theta h)[h,h,h]| \le M \left( \frac{D^2 f(y)[h,h]^{3/2}}{(-f(y))^{1/2}} + \frac{D^2 f(y)[h,h]\,|Df(y)[h]|}{-f(y)} \right). \tag{3.4}$$

In the sequel it is helpful to abbreviate the quantities

$$d_1 := \frac{Df(y)[h]}{-f(y)}, \quad d_2 := \frac{D^2 f(y)[h,h]}{-f(y)} \quad\text{and}\quad d_3 := \frac{D^3 f(y)[h,h,h]}{-f(y)}.$$

Without loss of generality assume $d_1 \ge 0$ (otherwise substitute $h$ by $-h$). By convexity of $f$ also $d_2 \ge 0$. Inequality (3.4) is true for sufficiently small $\theta h$ and some $\mu = \mu(\theta) \in (0,1)$. Dividing (3.4) by $-f(y) > 0$ and taking the limit $\theta \to 0$ we obtain

$$|d_3| \le M (d_2^{3/2} + d_2 d_1). \tag{3.5}$$

Self-concordance is defined by the derivatives of $\varphi(y) := -\ln(-f(y))$ in (1.1). Observe that

$$D\varphi(y)[h] = d_1, \quad D^2\varphi(y)[h,h] = d_2 + d_1^2 \quad\text{and}\quad D^3\varphi(y)[h,h,h] = d_3 + 3 d_2 d_1 + 2 d_1^3.$$

Using (3.5) we estimate

$$|D^3\varphi(y)[h,h,h]| \le |d_3| + 3 d_1 d_2 + 2 d_1^3 \le M d_2^{3/2} + (M + 3) d_1 d_2 + 2 d_1^3.$$

Comparing this with $(D^2\varphi[h,h])^{3/2} = (d_2 + d_1^2)\sqrt{d_2 + d_1^2}$ it becomes obvious that a suitable multiple of $(D^2\varphi[h,h])^{3/2}$ upper bounds $|D^3\varphi(y)[h,h,h]|$, but finding the best possible multiple is tedious work which we would like to banish to a footnote⁴. Recalling definition (1.1) we see that $\sqrt\alpha = 1 + M$ gives precisely the inequality in the footnote. ■
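The final estimate of this proof can be sampled numerically. The sketch below uses the abbreviations $a = d_1$ and $b = \sqrt{d_2}$ of the footnote and checks $M b^3 + (M+3) a b^2 + 2a^3 \le 2(1+M)(a^2+b^2)^{3/2}$ on random nonnegative values. The sampling ranges and function names are arbitrary choices made for illustration.

```python
import random

def third_derivative_bound(M, a, b):
    """Bound on |D^3 phi| obtained from (3.5): M b^3 + (M + 3) a b^2 + 2 a^3."""
    return M * b**3 + (M + 3) * a * b**2 + 2 * a**3

def self_concordance_rhs(M, a, b):
    """2 (1 + M) (D^2 phi)^{3/2} with D^2 phi = a^2 + b^2."""
    return 2 * (1 + M) * (a * a + b * b) ** 1.5

def check_footnote(trials=10000, seed=2):
    """Return True if the footnote inequality holds on all random samples."""
    rng = random.Random(seed)
    for _ in range(trials):
        M = rng.uniform(0, 50)
        a = rng.uniform(0, 10)
        b = rng.uniform(0, 10)
        if third_derivative_bound(M, a, b) > self_concordance_rhs(M, a, b) * (1 + 1e-12):
            return False
    return True
```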


3.2.3.  Proof  of  Lemma  2 

For the sake of completeness we state this proof, which is already given in Theorem 1.1 in [12] in slightly modified form.

Let an $\alpha$-self-concordant function $\varphi$, a point $y \in P^\circ$, the gradient $D\varphi(y) = g(y)$, the Hessian matrix $D^2\varphi(y) = H(y)$, an arbitrary vector $h \in \mathbb{R}^n$ with $\delta = \|h\|_{H(y)} < \frac{1}{\sqrt\alpha}$ and an arbitrary vector $z \in \mathbb{R}^n$ be given.

Let $s \in [0,1]$ be such that $y + sh \in P^\circ$. We first show that for such $s$ the inequality

$$(1 - s\delta\sqrt\alpha)\,\|z\|_{H(y)} \le \|z\|_{H(y + sh)} \le \frac{1}{1 - s\delta\sqrt\alpha}\,\|z\|_{H(y)} \tag{3.6}$$

holds. In a second step one can then show that for $s = 1$ still $y + sh \in P^\circ$. To evaluate how the $H$-norm of the vectors $h$ and $z$ changes for different matrices $H(y + \rho h)$, with $\rho \in [0,s]$, let us define

$$\Gamma(\rho) := \|h\|^2_{H(y + \rho h)} = h^T D^2\varphi(y + \rho h) h \ge 0 \quad\text{and}$$

$$\Phi(\rho) := \|z\|^2_{H(y + \rho h)} = z^T D^2\varphi(y + \rho h) z \ge 0.$$

⁴ Abbreviating again $a = d_1$, $b = \sqrt{d_2}$ we obtain

$$|D^3\varphi(y)[h,h,h]|^2 \le (Mb^3 + Mab^2 + 3ab^2 + 2a^3)^2$$

$$= M^2b^6 + 2M^2ab^5 + 6Mab^5 + 4Ma^3b^3 + M^2a^2b^4 + 6Ma^2b^4 + 4Ma^4b^2 + 9a^2b^4 + 12a^4b^2 + 4a^6.$$

Using that $2ab \le a^2 + b^2$ we eliminate all odd powers and summarize

$$\le 2M^2b^6 + 2M^2a^2b^4 + 11Ma^2b^4 + 3Mb^6 + 6Ma^4b^2 + 9a^2b^4 + 12a^4b^2 + 4a^6$$

$$\le (4 + 8M + 4M^2)(a^6 + 3a^4b^2 + 3a^2b^4 + b^6) = 4(1 + M)^2(a^2 + b^2)^3 = 4(1 + M)^2 (D^2\varphi[h,h])^3.$$

Summarizing and taking square roots we get $|D^3\varphi(y)[h,h,h]| \le 2(1 + M)(D^2\varphi(y)[h,h])^{3/2}$. (Actually, a somewhat smaller constant than $2(1 + M)$ would work as well.)
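In order to show (3.6) we will show that the function $\Phi$ is "nearly constant". The changes of $\Gamma$ and $\Phi$ can be estimated by their derivatives $\Gamma'(\rho)$ and $\Phi'(\rho)$ using the estimate about the spectral radius for symmetric trilinear forms (3.2) proved earlier: From the $\alpha$-self-concordance of $\varphi$ follows with (3.2) that

$$|D^3\varphi(y)[z_1, z_2, z_3]| \le 2\sqrt\alpha\, D^2\varphi(y)[z_1,z_1]^{1/2} D^2\varphi(y)[z_2,z_2]^{1/2} D^2\varphi(y)[z_3,z_3]^{1/2}$$

which implies that

$$|\Gamma'(\rho)| \le 2\sqrt\alpha\,(h^T D^2\varphi(y + \rho h) h)^{3/2} = 2\sqrt\alpha\,\Gamma(\rho)^{3/2} \quad\text{and}$$

$$|\Phi'(\rho)| \le 2\sqrt\alpha\,(h^T D^2\varphi(y + \rho h) h)^{1/2}(z^T D^2\varphi(y + \rho h) z) = 2\sqrt\alpha\,\Gamma(\rho)^{1/2}\Phi(\rho).$$

Using the first inequality one can show that $\Gamma^{1/2}$ is "small", and with the second inequality this implies that $|\Phi'|$ is "small". There are two cases:

1. $\Gamma(\rho_0) = 0$ for some $\rho_0 \in [0,s]$. This implies $\Gamma(\rho) = 0$ for all $\rho \in [0,s]$ (by integrating $\Gamma'(\rho)\,d\rho$ over a short interval $[\rho_0, \rho_0 + \epsilon]$ for small $|\epsilon|$ and using the first inequality) and then $\Phi'(\rho) = 0$ for all $\rho \in [0,s]$ and $\Phi(s) = \Phi(0)$, which implies that the $H$-norm of $z$ does not change at all and that (3.6) is true.

2. $\Gamma(\rho) > 0$ for all $\rho \in [0,s]$. In this case one can bound $\Gamma^{1/2}(\rho)$ as follows: $\big|\frac{d}{d\rho}\big(\Gamma(\rho)^{-1/2}\big)\big| = \big|\frac12 \Gamma(\rho)^{-3/2}\Gamma'(\rho)\big| \le \sqrt\alpha$ for all $\rho \in [0,s]$, which implies that $\Gamma^{-1/2}(\rho) \ge \Gamma^{-1/2}(0) - \rho\sqrt\alpha = \frac1\delta - \rho\sqrt\alpha > 0$ (by definition of $\delta$) or that $\Gamma^{1/2}(\rho) \le \delta/(1 - \rho\delta\sqrt\alpha)$. Inserting this in the second inequality one obtains

$$|\Phi'(\rho)| \le \frac{2\delta\sqrt\alpha}{1 - \rho\delta\sqrt\alpha}\,\Phi(\rho).$$

Again one may conclude (like above) that either $\Phi(\rho) = 0$ on $[0,s]$ (in which case there is nothing to show) or $\Phi(\rho) > 0$ on $[0,s]$. If $\Phi(\rho) > 0$ one can estimate

$$|(\ln\Phi(\rho))'| = \Big|\frac{\Phi'(\rho)}{\Phi(\rho)}\Big| \le \frac{2\delta\sqrt\alpha}{1 - \rho\delta\sqrt\alpha} \quad\text{and thus}$$

$$\Big|\ln\frac{\Phi(s)}{\Phi(0)}\Big| = |\ln\Phi(s) - \ln\Phi(0)| = \Big|\int_0^s (\ln\Phi(\rho))'\,d\rho\Big| \le \int_0^s \frac{2\delta\sqrt\alpha}{1 - \rho\delta\sqrt\alpha}\,d\rho$$

$$= -2\ln(1 - \rho\delta\sqrt\alpha)\Big|_{\rho=0}^{s} = 2\ln\Big(\frac{1}{1 - s\delta\sqrt\alpha}\Big).$$

This implies that

$$\Big(\frac{\Phi(s)}{\Phi(0)}\Big)^{1/2} \le \frac{1}{1 - s\delta\sqrt\alpha} \quad\text{and}\quad \Big(\frac{\Phi(0)}{\Phi(s)}\Big)^{1/2} \le \frac{1}{1 - s\delta\sqrt\alpha},$$

which is inequality (3.6).

A short proof by contradiction shows that $s = 1$ is possible, i.e. that $1 = \max\{\rho \in [0,1] \mid y + \rho h \in P^\circ\}$: Suppose on the contrary that $1 > s := \sup\{\rho \in [0,1] \mid y + \rho h \in P^\circ\}$; then by the inequality (3.6) it holds that $D^2\varphi(y + \rho h)$ is bounded for all $\rho \in [0,s)$, and thus $\varphi(y + \rho h)$ is bounded for all $\rho \in [0,s)$. The strong self-concordance of $\varphi$ implies that $\varphi(x)$ goes to infinity as $x$ approaches the boundary of $P$, $\lim_{x \to \partial P}\varphi(x) = \infty$, so that $y + sh \notin \partial P$, the contradiction we were looking for. ■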
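Inequality (3.6) can be tested on the simplest barrier. The sketch below assumes the one-dimensional example $\varphi(x) = -\ln x$ on $(0, \infty)$, which satisfies $|\varphi'''| = 2(\varphi'')^{3/2}$ and is therefore self-concordant with $\sqrt\alpha = 1$. Here $H(x) = 1/x^2$, so $\|v\|_{H(x)} = |v|/x$. The default values of `y`, `h` and `z` are arbitrary.

```python
def check_36(y=2.0, h=-1.5, z=0.7, steps=100):
    """Check (3.6) along y + s*h, s in [0, 1], for phi(x) = -ln(x), sqrt(alpha) = 1."""

    def norm(v, x):
        # ||v||_{H(x)} with H(x) = 1 / x^2
        return abs(v) / x

    delta = norm(h, y)  # the lemma requires delta < 1 / sqrt(alpha) = 1
    assert delta < 1
    for k in range(steps + 1):
        s = k / steps
        middle = norm(z, y + s * h)
        lower = (1 - s * delta) * norm(z, y)
        upper = norm(z, y) / (1 - s * delta)
        if not (lower <= middle * (1 + 1e-12) and middle <= upper * (1 + 1e-12)):
            return False
    return True
```

For a step towards the boundary ($h < 0$) the upper bound in (3.6) is attained with equality in this example, which suggests the estimate cannot be improved in general.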


3.2.4.  Proof  of  Remark  3 

Substituting $z$ by $h$, the Relative Lipschitz condition reduces to

$$|h^T(D^2\varphi(x + h) - D^2\varphi(x))h| \le \beta\,(h^T D^2\varphi(x) h)^{3/2}.$$

Defining $p(t) := h^T(D^2\varphi(x + th) - D^2\varphi(x))h$, $p'(t) = D^3\varphi(x + th)[h,h,h]$, one obtains from

$p(t) \le \beta\,(h^T D^2\varphi(x) h)^{3/2}\,t$ for $t \ge 0$ that $p'(0) \le \beta\,(h^T D^2\varphi(x) h)^{3/2}$. This is exactly the condition for $\alpha$-self-concordance with $\beta = 2\sqrt\alpha$, from which the claim follows. ■


3.2.5.  Proof  of  concavity  in  (2.2) 

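The proof of concavity of $\Psi$ given in [14] before statement (2.8) can be generalized in a straightforward way to nonlinear convex functions $f_i$: The term $(\lambda - f_0(x))^q$ has the same structure as the remaining $m$ terms and is therefore omitted here for the sake of clarity. One obtains

$$\frac{D\Psi(x)}{\Psi(x)} = \frac1m \sum_{i} \frac{Df_i(x)}{f_i(x)}$$

and

$$\frac{D^2\Psi(x)}{\Psi(x)} - \frac{D^T\Psi(x)\,D\Psi(x)}{\Psi^2(x)} = D\Big(\frac{D^T\Psi(x)}{\Psi(x)}\Big) = \frac1m \sum_{i}\Big( \frac{D^2 f_i(x)}{f_i(x)} - \frac{D^T f_i(x)\,Df_i(x)}{f_i^2(x)} \Big).$$

Hence,

$$\frac{D^2\Psi(x)}{\Psi(x)} = \frac1m \sum_{i}\Big( \frac{D^2 f_i(x)}{f_i(x)} - \frac{D^T f_i(x)\,Df_i(x)}{f_i^2(x)} \Big) + \frac{1}{m^2}\Big(\sum_{i}\frac{D^T f_i(x)}{f_i(x)}\Big)\Big(\sum_{i}\frac{Df_i(x)}{f_i(x)}\Big).$$

Note that for arbitrary vectors $h$ and $v_i \in \mathbb{R}^n$ we have

$$h^T\Big(m\sum_i v_i v_i^T\Big)h = m\sum_i (v_i^T h)^2 \ge \Big(\sum_i v_i^T h\Big)^2 = h^T\Big(\big(\sum_i v_i\big)\big(\sum_i v_i^T\big)\Big)h.$$

Taking $v_i := D^T f_i(x)/f_i(x)$ and observing that $f_i(x) < 0$ and $\Psi(x) > 0$ this implies that $D^2\Psi(x)$ is negative semidefinite, i.e. $\Psi$ is concave. ■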
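The concavity statement can be sampled in one dimension. The sketch below assumes $\Psi(x) = \big(\prod_i -f_i(x)\big)^{1/m}$ with the objective term omitted, as in the proof, and uses three hypothetical convex constraints whose common strictly feasible set is the interval $(-0.9, 0.8)$. It checks midpoint concavity on random pairs of feasible points.

```python
import random

# Hypothetical convex constraint functions f_i(x) < 0 on (-0.9, 0.8).
CONSTRAINTS = [
    lambda x: x * x - 1.0,
    lambda x: x - 0.8,
    lambda x: -x - 0.9,
]

def psi(x):
    """Psi(x) = (prod_i -f_i(x))^(1/m) for a strictly feasible x."""
    product = 1.0
    for f in CONSTRAINTS:
        product *= -f(x)
    return product ** (1.0 / len(CONSTRAINTS))

def check_concave(trials=10000, seed=3):
    """Return True if Psi((u+v)/2) >= (Psi(u) + Psi(v))/2 on all samples."""
    rng = random.Random(seed)
    for _ in range(trials):
        u = rng.uniform(-0.89, 0.79)
        v = rng.uniform(-0.89, 0.79)
        if psi(0.5 * (u + v)) < 0.5 * (psi(u) + psi(v)) - 1e-12:
            return False
    return True
```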
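3.2.6. Proof of Lemma 3

This proof proceeds in two steps. First we show that the function $\varphi$ is well approximated by its quadratic Taylor approximation $q$. In the second step this information is used to relate the (ellipsoidal) level sets of $q$ to the level sets of $\Psi = \exp(-\varphi/m)$.

Let $y \in P^\circ$ be arbitrary and define the quadratic approximation $q_y$ of $\varphi$ in $y$ by

$$q_y(x) := \varphi(y) + D\varphi(y)(x - y) + \tfrac12 (x - y)^T D^2\varphi(y)(x - y).$$

For $h \in \mathbb{R}^n$ and sufficiently small $\mu$ (such that $\|\mu h\|_{H(y)} < 1/\sqrt\alpha$) define the difference of $\varphi$ and $q_y$ in the point $y + \mu h$ by

$$d(\mu) := q_y(y + \mu h) - \varphi(y + \mu h).$$

The Lagrange remainder formula applied to $d$ yields

$$d(\mu) = \frac{\mu^3}{6}\,d'''(\nu\mu) \quad\text{with some } \nu \in (0,1).$$

Using the definition of self-concordance one obtains

$$|d'''(\mu)| = |D^3\varphi(y + \mu h)[h,h,h]| \le 2\sqrt\alpha\, D^2\varphi(y + \mu h)[h,h]^{3/2}.$$

For $\|\mu h\|_{H(y)} < \frac{1}{\sqrt\alpha}$ this can further be bounded by Lemma 2:

$$D^2\varphi(y + \mu h)[h,h] \le \frac{D^2\varphi(y)[h,h]}{(1 - \sqrt\alpha\,\|\mu h\|_{H(y)})^2}.$$

Inserting the last two estimates in the above Lagrange remainder formula allows us to continue

$$|d(\mu)| \le \frac{\mu^3}{6}\,2\sqrt\alpha \left( \frac{D^2\varphi(y)[h,h]}{(1 - \sqrt\alpha\,\|\mu h\|_{H(y)})^2} \right)^{3/2} \le \frac{\sqrt\alpha\,\|\mu h\|^3_{H(y)}}{3(1 - \sqrt\alpha\,\|\mu h\|_{H(y)})^3}$$

for $\|\mu h\|_{H(y)} < 1/\sqrt\alpha$. This completes the first step of the proof.

The last inequality will now be used to obtain information about the increase of $\varphi$, and thus also about the decrease of $\Psi$ (defined in (2.2)), around its maximum $\bar x$ (the analytic center of $P$). This allows us to construct a decreasing linear function on the ray $\bar x + \mu\hat h$, $\mu \ge 0$, that bounds the concave function $\Psi$ in $\mu \in [1, \infty)$ from above. Here, for $h \in \mathbb{R}^n$ we define $\hat h := h/(4\sqrt\alpha\,\|h\|_{H(\bar x)})$. The estimate of $d$ for $h = \hat h$ now implies that

$$|d(1)| \le \frac{\sqrt\alpha\,\|\hat h\|^3_{H(\bar x)}}{3(1 - 1/4)^3} = \frac{16}{81}\,\|\hat h\|^2_{H(\bar x)}.$$

Since $q_{\bar x}(\bar x + \hat h) = \varphi(\bar x) + \tfrac12\|\hat h\|^2_{H(\bar x)}$ it follows that

$$\varphi(\bar x + \hat h) \ge \varphi(\bar x) + \tfrac12\|\hat h\|^2_{H(\bar x)} - \tfrac{16}{81}\|\hat h\|^2_{H(\bar x)} \ge \varphi(\bar x) + \tfrac{3}{10}\|\hat h\|^2_{H(\bar x)}.$$

Using this and the definition of $\Psi$ yields

$$\Psi(\bar x + \hat h) = \exp\Big(-\frac{\varphi(\bar x + \hat h)}{m}\Big) \le \exp\Big(-\frac{\varphi(\bar x)}{m}\Big)\exp\Big(-\frac{3}{10m}\|\hat h\|^2_{H(\bar x)}\Big) = \Psi(\bar x)\exp\Big(-\frac{3}{10m}\|\hat h\|^2_{H(\bar x)}\Big).$$

Let $\epsilon := \frac1m\|\hat h\|^2_{H(\bar x)} = \frac{1}{16 m \alpha} \le \frac{1}{16}$. Since $\exp(t) \le 1 + t + \tfrac12 t^2$ for $t \le 0$ one can conclude that

$$\exp\Big(-\frac{3}{10}\epsilon\Big) \le 1 - \frac{3}{10}\epsilon + \frac{9}{200}\epsilon^2 \le 1 - \frac{\epsilon}{4},$$

which implies that

$$\Psi(\bar x + \hat h) \le \Psi(\bar x)\Big(1 - \frac{\epsilon}{4}\Big).$$

Setting $\Psi(x) := -\infty$ for $x \notin P$, $\Psi$ is well defined and concave everywhere in $\mathbb{R}^n$. Now let $h$ be such that $\|h\|_{H(\bar x)} \ge 16 m \sqrt\alpha = \frac{4}{\epsilon}\|\hat h\|_{H(\bar x)}$; then we have $\Psi(\bar x + h) \le 0$, i.e. $\bar x + h \notin P^\circ$ (since $\Psi(x) > 0$ for $x \in P^\circ$). ■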
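The numerical constants used in the second step of this proof can be recomputed directly. The sketch below assumes the normalization $\|\hat h\|_{H(\bar x)} = 1/(4\sqrt\alpha)$ and $\epsilon = \|\hat h\|^2_{H(\bar x)}/m$. The values of `m` and `alpha` are arbitrary inputs with $m \ge 1$ and $\alpha \ge 1$.

```python
import math

def check_constants(m=5, alpha=2.0):
    """Recompute the constants 16/81, 3/10, 1 - eps/4 and 16 m sqrt(alpha)."""
    norm_hat = 1.0 / (4.0 * math.sqrt(alpha))  # ||h_hat|| in the H(x_bar)-norm

    # Cubic remainder bound at mu = 1 equals (16/81) ||h_hat||^2.
    remainder = math.sqrt(alpha) * norm_hat**3 / (3 * (1 - 0.25) ** 3)
    ok_remainder = abs(remainder - 16.0 / 81.0 * norm_hat**2) < 1e-12

    # 1/2 - 16/81 is at least 3/10.
    ok_increase = 0.5 - 16.0 / 81.0 >= 0.3

    # exp(-(3/10) eps) <= 1 - eps/4 for the small eps used here.
    eps = norm_hat**2 / m
    ok_exp = math.exp(-0.3 * eps) <= 1 - eps / 4

    # (4/eps) ||h_hat|| equals 16 m sqrt(alpha).
    ok_radius = abs(4 / eps * norm_hat - 16 * m * math.sqrt(alpha)) < 1e-9

    return ok_remainder and ok_increase and ok_exp and ok_radius
```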


3.2.7.  Proof  of  Lemma  4 

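(Simplified version of the proof in [12], Prop. 1.2, Th. 1.2, 1.3 and 1.4)

Define $y(s) := y + sh$ for $s \in [0,1]$ where $h = -D^2\varphi(y)^{-1}D\varphi(y)^T$ is the Newton step starting in $y$ to minimize $\varphi$; then by Lemma 2, $y(s) \in P^\circ$ for all $s \in [0,1]$ and, with $\mu := \sqrt\alpha\,\|h\|_{H(y)}$,

$$|z^T(D^2\varphi(y(s)) - D^2\varphi(y))z| \le \Big(\frac{1}{(1 - s\mu)^2} - 1\Big) z^T D^2\varphi(y) z \quad \forall z \in \mathbb{R}^n.$$

Using the generalized Cauchy-Schwarz inequality (3.1) we obtain

$$\Big|\frac{d}{ds}D\varphi(y(s))z - h^T D^2\varphi(y) z\Big| = |h^T(D^2\varphi(y(s)) - D^2\varphi(y))z|$$

$$\le \Big(\frac{1}{(1 - s\mu)^2} - 1\Big)\sqrt{z^T D^2\varphi(y) z}\,\sqrt{h^T D^2\varphi(y) h} = \Big(\frac{1}{(1 - s\mu)^2} - 1\Big)\|z\|_{H(y)}\frac{\mu}{\sqrt\alpha}.$$

The left hand side is the absolute value of the derivative $\kappa'(s)$ where $\kappa$ is defined by $\kappa(s) := D\varphi(y(s))z - (1 - s)D\varphi(y)z$. By integration ($\kappa(0) = 0$!) one can thus bound

$$|\kappa(s)| \le \|z\|_{H(y)}\frac{\mu}{\sqrt\alpha}\int_0^s \Big(\frac{1}{(1 - t\mu)^2} - 1\Big)dt = \|z\|_{H(y)}\frac{\mu}{\sqrt\alpha}\cdot\frac{s^2\mu}{1 - s\mu}.$$

For $s = 1$, $y(s) = y + h$ this implies

$$|\kappa(1)| = |D\varphi(y + h)z| \le \frac{\mu^2}{1 - \mu}\cdot\frac{\|z\|_{H(y)}}{\sqrt\alpha}.$$

Choosing $z = \tilde h = -D^2\varphi(y + h)^{-1}D\varphi(y + h)^T$ as the next Newton step one obtains

$$\|\tilde h\|^2_{H(y+h)} = |D\varphi(y + h)\tilde h| \le \frac{\mu^2}{(1 - \mu)\sqrt\alpha}\|\tilde h\|_{H(y)} \le \frac{\mu^2}{(1 - \mu)^2\sqrt\alpha}\|\tilde h\|_{H(y+h)},$$

the last inequality following from Lemma 2. With $\delta := \mu/\sqrt\alpha$ the claim follows when dividing the last line by $\|\tilde h\|_{H(y+h)}$. ■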
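The quadratic decrease of the Newton step length can be observed on a small example. The sketch below assumes that the bound obtained above reads $\|\tilde h\|_{H(y+h)} \le \sqrt\alpha\,\delta^2/(1 - \sqrt\alpha\,\delta)^2$ and uses the hypothetical barrier $\varphi(x) = -\ln x - \ln(1 - x)$ on $(0,1)$, which is self-concordant with $\sqrt\alpha = 1$. The starting point is arbitrary.

```python
import math

def dphi(x):
    """First derivative of phi(x) = -ln(x) - ln(1 - x)."""
    return -1.0 / x + 1.0 / (1.0 - x)

def d2phi(x):
    """Second derivative of phi."""
    return 1.0 / x**2 + 1.0 / (1.0 - x) ** 2

def newton_decrement(x):
    """delta = ||h||_{H(x)} for the Newton step h = -phi'(x) / phi''(x)."""
    return abs(dphi(x)) / math.sqrt(d2phi(x))

def newton_step(x):
    """One full Newton step for minimizing phi."""
    return x - dphi(x) / d2phi(x)

def check_lemma4(x=0.4, iters=4):
    """Check delta_next <= delta^2 / (1 - delta)^2 along the Newton iterates."""
    for _ in range(iters):
        delta = newton_decrement(x)
        x_next = newton_step(x)
        bound = delta**2 / (1 - delta) ** 2  # sqrt(alpha) = 1
        if newton_decrement(x_next) > bound + 1e-12:
            return False
        x = x_next
    return True
```

The iterates approach the analytic center $x = 0.5$ of the interval; the small additive tolerance only absorbs rounding once the step length has reached machine precision.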
3.2.8. Proof of Remark 4

The previous Lemma implies that if $\delta = \delta_0 \le \frac{1}{9\sqrt\alpha}$ and Newton's method starting in $y_0 := y$ is iterated, one obtains a sequence of strictly feasible points $y_k = y_{k-1} + h_{k-1}$ for $k \ge 1$ where the norms $\|\cdot\|_{H(y_k)}$ of the Newton steps $h_k$ converge to zero. Defining

h  '•=  and  7  :=  it  follows  (again  from  Lemma  4)  that 

h  <  1/61-1  <  ^(T^o)2*- 

Here, the norm of the first Newton step $h_0$ is $\delta_0 = \|h_0\|_{H(y_0)} \le \frac{1}{9\sqrt\alpha}$. By Lemma 2 this implies for any $z \in \mathbb{R}^n$ that

$$\|z\|_{H(y_1)} \le \frac{1}{1 - \frac19}\|z\|_{H(y_0)} = \frac98\|z\|_{H(y_0)}.$$

Since $\delta_k$ is a decreasing sequence, this relative change in two subsequent norms is always bounded by $\frac98$ so that

IIMUto)  <  <  i(^)‘  and 

Since  lim,.^  ll*fcll//(y0)  =  0  also  Vl  ~+ 


00  1  2-v2  7 

fc=l  ~  1  215 

x  and  the  claim  follows.  | 


3.2.9.  Proof  of  Remark  5 


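(i) First let $f$ be a convex $C^2$-function, and for $f(x) < 0$ let $\varphi(x) := -\ln(-f(x))$ be its logarithmic barrier function, $g(x) := D\varphi(x)^T = \frac{Df(x)^T}{-f(x)}$ the gradient of $\varphi$ and

$$H(x,\epsilon) := H(x) + \epsilon I = D^2\varphi(x) + \epsilon I = \frac{D^2 f(x)}{-f(x)} + \frac{Df(x)^T Df(x)}{f^2(x)} + \epsilon I$$

a perturbed Hessian matrix of $\varphi$. Then $H(x,\epsilon)$ is positive definite for all $\epsilon > 0$ and

$$\|g(x)\|^2_{H^{-1}(x,\epsilon)} = \frac{Df(x)}{-f(x)}\Big(\frac{Df(x)^T Df(x)}{f^2(x)} + \frac{D^2 f(x)}{-f(x)} + \epsilon I\Big)^{-1}\frac{Df(x)^T}{-f(x)}.$$

To simplify let $v := \frac{Df(x)^T}{-f(x)}$ and $G := \frac{D^2 f(x)}{-f(x)} + \epsilon I$; then $G$ is positive definite and

$$\|g(x)\|^2_{H^{-1}(x,\epsilon)} = v^T(G + vv^T)^{-1}v = v^T\Big(G^{-1} - \frac{G^{-1}vv^TG^{-1}}{1 + v^TG^{-1}v}\Big)v = \frac{v^TG^{-1}v}{1 + v^TG^{-1}v} < 1.$$

(Note again the equality $\|g\|^2_{H^{-1}} = g^T H^{-1} g = h^T H h = \|h\|^2_H$ for $h = H^{-1}g$.)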
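The rank-one inversion formula used in part (i) can be verified on a 2x2 example. In the sketch below the matrix `G` and the vector `v` are hypothetical inputs; the function returns $v^T(G + vv^T)^{-1}v$ computed directly and via $v^TG^{-1}v/(1 + v^TG^{-1}v)$.

```python
def inv2(A):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def quad(A, v):
    """Quadratic form v^T A v."""
    return sum(v[i] * A[i][j] * v[j] for i in range(2) for j in range(2))

def gradient_norm_sq(G, v):
    """Return ||v||^2 in the (G + v v^T)^{-1}-norm, directly and via the formula."""
    H = [[G[i][j] + v[i] * v[j] for j in range(2)] for i in range(2)]
    direct = quad(inv2(H), v)
    s = quad(inv2(G), v)
    return direct, s / (1.0 + s)
```

Both values agree up to rounding and stay below 1, however large `v` is chosen.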

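(ii) The second part of the proof now follows immediately (taking the limit as $\epsilon \to 0$) from another "Cauchy-Schwarz-type" inequality stated in Proposition 3.5 of [12] without further comment (or proof).

If $\|g_i\|^2_{H_i^{-1}} = \mu_i$ for $1 \le i \le m$ (with positive definite matrices $H_i$)

then $\big\|\sum_i g_i\big\|^2_{(\sum_i H_i)^{-1}} \le \sum_i \mu_i$.

Proof: Observe that

$\mu_i = \min\{\mu \ge 0 \mid (g_i^T h)^2 \le \mu\, h^T H_i h \ \forall h \in \mathbb{R}^n\}$. We want to show that $\bar\mu := \min\{\mu \ge 0 \mid ((\sum_i g_i)^T h)^2 \le \mu\, h^T(\sum_i H_i)h \ \forall h \in \mathbb{R}^n\}$ fulfills $\bar\mu \le \sum_i \mu_i$. We may also write

$\bar\mu = \min\{\mu \ge 0 \mid (\sum_i g_i^T h)^2 \le \mu \sum_i h^T H_i h \ \ \forall h : (g_i^T h)^2 \le \mu_i h^T H_i h\}$ (since the last set of inequalities is by definition of $\mu_i$ always satisfied). From this definition of $\bar\mu$ it is obvious that $\hat\mu \ge \bar\mu$ for any $\hat\mu$ satisfying

$$0 = \sup\Big\{\big(\sum_i \alpha_i\big)^2 - \hat\mu\sum_i \beta_i \ \Big|\ \alpha_i \in \mathbb{R},\ \beta_i \ge 0,\ \alpha_i^2 - \mu_i\beta_i \le 0\Big\}.$$

(If we added the additional restriction that $\alpha_i = g_i^T h$ and $\beta_i = h^T H_i h$, then obviously $\hat\mu = \bar\mu$ would be feasible, i.e. would satisfy that the "sup" $= 0$. By allowing a (possibly) larger set of $\alpha_i, \beta_i$ here, the "sup" may increase and it might require a larger value $\hat\mu \ge \bar\mu$ to ensure feasibility of $\hat\mu$.) For any $\rho \ne 0$ the sign of the "sup" is invariant under the transformation "$\forall i : \beta_i \to \rho^2\beta_i,\ \alpha_i \to \rho\alpha_i$". Hence we may add the additional constraint $\sum_i \beta_i \le 1$ while keeping the same set of feasible values $\hat\mu$ and guaranteeing that on the resulting compact domain the "sup" is actually a maximum for which we can consider the necessary conditions. For this purpose define $e := (1, 1, \dots, 1)^T \in \mathbb{R}^m$, $\alpha := (\alpha_1, \alpha_2, \dots, \alpha_m)^T \in \mathbb{R}^m$, $\xi_0 := (0, \dots, 0; e^T)^T \in \mathbb{R}^{2m}$ and

$$\xi_i := (0, \dots, 0, 2\alpha_i, 0, \dots, 0;\ 0, \dots, 0, -\mu_i, 0, \dots, 0)^T \in \mathbb{R}^{2m}$$

for $1 \le i \le m$, where only the $i$-th and the $(m+i)$-th entry of $\xi_i$ are nonzero. Since $\beta_i = 0$ implies $\alpha_i = 0$ we may restrict ourselves to $\beta_i > 0$. (And also $\mu_i > 0$.) Then the necessary conditions for a maximum imply

$$(2e^T\alpha\, e^T;\ -\hat\mu e^T)^T = \rho_0\xi_0 + \rho_1\xi_1 + \dots + \rho_m\xi_m,$$

where $\rho_i \in \mathbb{R}$ are the Lagrange multipliers. For $i \ge 1$ we deduce from the $(m+i)$-th entry of $\xi_i$ that $\rho_i = \frac{\rho_0 + \hat\mu}{\mu_i}$, and the $i$-th entry tells us then that $\alpha_i = \frac{e^T\alpha}{\rho_0 + \hat\mu}\mu_i$. Substituting this in the "sup" yields

$$\Big(\frac{e^T\alpha}{\rho_0 + \hat\mu}\Big)^2\Big(\sum_i \mu_i\Big)^2 - \hat\mu\sum_i \beta_i.$$

Substituting now $\beta_i \ge \alpha_i^2/\mu_i$ we may continue

$$\le \Big(\frac{e^T\alpha}{\rho_0 + \hat\mu}\Big)^2\Big(\sum_i \mu_i\Big)\Big(\sum_i \mu_i - \hat\mu\Big).$$

Factoring $\big(\frac{e^T\alpha}{\rho_0 + \hat\mu}\big)^2 \ge 0$ it is obvious that the last term is not positive, so that the "sup" is zero, for any $\hat\mu \ge \sum_i \mu_i$ (and in particular for $\hat\mu = \sum_i \mu_i$). ■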
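The inequality of part (ii) can be tried on small data. The sketch below evaluates both sides, $\|\sum_i g_i\|^2_{(\sum_i H_i)^{-1}}$ and $\sum_i \|g_i\|^2_{H_i^{-1}}$, for 2x2 positive definite matrices. The helper names and the sample data in the usage note are hypothetical.

```python
def inv2(A):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def quad(A, v):
    """Quadratic form v^T A v."""
    return sum(v[i] * A[i][j] * v[j] for i in range(2) for j in range(2))

def sum_inequality_sides(gs, Hs):
    """Return (lhs, rhs) of ||sum g_i||^2_{(sum H_i)^-1} <= sum ||g_i||^2_{H_i^-1}."""
    total_g = [sum(g[i] for g in gs) for i in range(2)]
    total_H = [[sum(H[i][j] for H in Hs) for j in range(2)] for i in range(2)]
    lhs = quad(inv2(total_H), total_g)
    rhs = sum(quad(inv2(H), g) for g, H in zip(gs, Hs))
    return lhs, rhs
```

For instance, calling `sum_inequality_sides` with three gradients and three positive definite matrices returns a pair whose first entry does not exceed the second.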


3.2.10.  Proof  of  Remark  6 

Let  again  H{x)  =  D2<p{x),  g(x)  =  Dip(x)T  and  h(x)  =  -H(x)~1  g(x)  be  the  Newton 
step  starting  in  x  for  finding  the  analytic  center  x  of  P.  Let  y  G  P°  and  f  <  J  be 
given  such  that  ||i  -  y||p(f)  <  and  set  h  :=  x  -  y. 
For some fixed vector $z \in \mathbb{R}^n$ define $l : \mathbb{R} \to \mathbb{R}$ by $l(\mu) := g(y + \mu h)^T z$; then $l(1) = 0$ (for any $z$) and

$$l'(\mu) = h^T H(y + \mu h) z, \quad l''(\mu) = D^3\varphi(y + \mu h)[h, h, z],$$

and $l(\mu) = l(0) + \mu l'(0) + \tfrac12\mu^2 l''(\xi)$ for some $\xi \in [0, \mu]$ by the Lagrange remainder formula. For $\mu = 1$ one obtains

$$0 = g(y)^T z + h^T H(y) z + \tfrac12 D^3\varphi(y + \xi h)[h, h, z].$$

From $\alpha$-self-concordance follows with (3.2) that

$$|z^T(g(y) + H(y)h)| \le \sqrt\alpha\; h^T H(y + \xi h) h\; (z^T H(y + \xi h) z)^{1/2}.$$

Let $d := g(y) + H(y)h$ (then $H(y)^{-1}d = h - h(y)$) and let $z := \frac{H(y)^{-1}d}{(d^T H(y)^{-1} d)^{1/2}}$; then the above formula reduces to

$$\frac{d^T H(y)^{-1} d}{(d^T H(y)^{-1} d)^{1/2}} \le \sqrt\alpha\,\|h\|^2_{H(y + \xi h)}\,\frac{\|H(y)^{-1}d\|_{H(y + \xi h)}}{\|H(y)^{-1}d\|_{H(y)}}. \tag{3.7}$$

(3.7) 


By  assumption,  |l/t||W(y+^)  <  Relating  the  norms  ||.||//(a+fc),  IMI//(y+<&)  and 

II  - 1 1  //( y )  again  by  lemma  2  it  is  straightforward  to  show  that  for  c  <  *  inequality 
(3.7)  implies 

\\ll(s)-'d\\„ht  =  (dTff(y)-,d),,i  < 

for  any  £  6  [0,1].  )5  Note  that  x  -  II(y)~xd  is  the  result  of  the  Newton  step. 
Applying  Lemma  2  one  more  time  yields 

\\H(y)-'d\\H{i)  <  _  —3 

which  establishes  quadratic  convergence.  I 


3.2.11.  Proof  of  (2.3) 


Suppose $y_k$ satisfies $\delta_k := \|h_k\|_{H(y_k,\lambda_k)} = \|h(y_k,\lambda_k)\|_{H(y_k,\lambda_k)} \le 1/(4\sqrt\alpha)$ and $y_{k+1} = y_k + h_k$.

By Lemma 4 then $\delta := \|h(y_{k+1},\lambda_k)\|_{H(y_{k+1},\lambda_k)} \le 1/(9\sqrt\alpha)$.

We examine the effect on $h$ and $H$ caused by the update of $\lambda_{k+1}$. Denote by $g(y,\lambda)$ the gradient

$$g(y,\lambda)^T := D\varphi(y,\lambda) = \frac{q\,c^T}{\lambda - c^T y} - \sum_{i=1}^m \frac{Df_i(y)}{f_i(y)}$$


4  For  {  =  0  or  {  =  1  it  follows  directly  from  Lemma  2,  for  {  €  (0, 1)  it  follows  when  applying 
Lemma  2  twice  and  using  t  <  $■. 
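($\varphi$ as defined in (2.1)) and $h(y_{k+1},\lambda_k) = -H(y_{k+1},\lambda_k)^{-1}g(y_{k+1},\lambda_k)$ (with norm $\le 1/(9\sqrt\alpha)$). The update $\lambda_{k+1} = \lambda_k - \sigma(\lambda_k - c^T y_{k+1})$ effects

$$g(y_{k+1},\lambda_{k+1}) = g(y_{k+1},\lambda_k) + q\,\frac{\sigma c}{\lambda_{k+1} - c^T y_{k+1}} = g(y_{k+1},\lambda_k) + q\sigma\tilde c$$

with $\tilde c = c/(\lambda_{k+1} - c^T y_{k+1})$. The $H^{-1}$-norm of the second part can be bounded by

$$\|q\sigma\tilde c\|_{H(y_{k+1},\lambda_{k+1})^{-1}} = \big(q\sigma\tilde c^T H(y_{k+1},\lambda_{k+1})^{-1} q\sigma\tilde c\big)^{1/2}$$

$$\le \sigma q\Big(\tilde c^T\big(H(y_{k+1},\infty) + q\tilde c\tilde c^T\big)^{-1}\tilde c\Big)^{1/2} \le \sigma\sqrt q$$

(since $H(y_{k+1},\infty)$ is positive definite).

Further, from $H(y_{k+1},\lambda_{k+1}) = H(y_{k+1},\lambda_k) + q(2\sigma - \sigma^2)\tilde c\tilde c^T$ follows that

$$\|h(y_{k+1},\lambda_{k+1})\|_{H(y_{k+1},\lambda_{k+1})} = \|g(y_{k+1},\lambda_{k+1})\|_{H(y_{k+1},\lambda_{k+1})^{-1}}$$

$$\le \|g(y_{k+1},\lambda_k)\|_{H(y_{k+1},\lambda_{k+1})^{-1}} + \|q\sigma\tilde c\|_{H(y_{k+1},\lambda_{k+1})^{-1}}$$

$$\le \|g(y_{k+1},\lambda_k)\|_{H(y_{k+1},\lambda_k)^{-1}} + \sigma\sqrt q \le \frac{1}{9\sqrt\alpha} + \sigma\sqrt q.$$

(Here, $q$ is chosen as $q = m$.) This shows (2.3). ■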


3.2.12.  Proof  of  (2.4) 

Denote the analytic center $\bar x(\lambda_k)$ by $\bar x$; then the iterate $y_{k+1}$ meets the assumptions of Remark 1 so that

\\yk+\  ~  *ll//{yfc+1.Afc>  <  ||^(j/t+i ,  Afc )||//(yfc+l ,Ajk)  +  \\yk+\  +  h(y*;+i,Afc)  -  x||//(tffc+1 ,a*) 

<  _L  Vo(_L\2<  _L_ 

“  9i/a  +  2  V9v/S7  ~  54y/a 

By  the  equivalence  of  the  //-norms  (Lemma  2)  the  same  distance  measured  in  the 
central  norm  fulfills  \\yk+i  -  z||//(f,A*)  <  ||yfc+i  -  i||H(y*+1,Afc)/(l  -  £)  <  From 
the inner ellipsoidal approximation of $P(\lambda)$ in part 1 of Lemma 2 follows that $y_{k+1}$ is at most 15 percent "away" from the center,

$$c^T y_{k+1} - c^T\bar x \le 0.15\Big(\max_{x \in P(\lambda_k)} c^T x - c^T\bar x\Big) \le 0.15(\lambda_k - c^T\bar x).$$

It is easy to show (see e.g. [3], Lemma (3.8)) that in the case of convex constraint functions $f_i$ and a linear objective function $f_0$ the following inequality holds in the analytic center $\bar x(\lambda)$ of $P(\lambda)$:
A  -  cTx(A)  >  ^(A  -  A*). 

(Only  for  q  >  m  in  (2.1)).  With  the  previous  result  this  implies  that 

A-oTyfc+1>^(A-A*) 

from  which  the  claim  follows.  I 
3.2.13. Proof of Lemma 5

In the beginning of this chapter the condition number of a bounded convex set $M$ has been defined. Under the assumptions of Section 1, the set $P = P(\lambda = \infty)$ has a finite condition $\mathrm{cond}_2(P) < \infty$. Since the program is also nondegenerate there exists a $\lambda_0 > \lambda^*$ (the unknown optimal value) such that $P(\lambda_0)$ is a simplex $S$ bounded by the objective function $f_0(x) \le \lambda_0$ and the $n$ linearly independent constraints that are active in the optimum. For $\lambda \in (\lambda^*, \lambda_0]$ the set $P(\lambda)$ is similar to $S$ and $\mathrm{cond}_2(P(\lambda)) = \mathrm{cond}_2(S)$. It is a simple exercise to verify that for $\lambda > \lambda^*$ the condition $\mathrm{cond}_2(P(\lambda))$ is a continuous function of $\lambda$. Since the limits $\lim_{\lambda \to \infty}\mathrm{cond}_2(P(\lambda)) = \mathrm{cond}_2(P)$ and $\lim_{\lambda \to \lambda^*}\mathrm{cond}_2(P(\lambda)) = \mathrm{cond}_2(S)$ are finite one may conclude that there exists a number $C < \infty$ such that

$$\mathrm{cond}_2(P(\lambda)) \le C \quad\text{for } \lambda \in (\lambda^*, \infty).$$

Remark: In special cases it may happen that for some $\lambda \in (\lambda^*, \infty)$ the condition numbers are not monotone and $\mathrm{cond}_2(P(\lambda)) > \max\{\mathrm{cond}_2(P), \mathrm{cond}_2(S)\}$; it seems however that always $\mathrm{cond}_2(P(\lambda)) \le \mathrm{cond}_2(P) + \mathrm{cond}_2(S)$ holds.

We recall the ellipsoidal approximation of the sets $P(\lambda)$ around the analytic centers $\bar x(\lambda)$ with the matrices $H(\bar x(\lambda), \lambda)$ and $\gamma = m - 1$. (For the case of a Linear Program [14] proved a better inclusion with similarity ratio $(m - 1)$.)

Therefore $\mathrm{cond}_2(H(\bar x(\lambda), \lambda)) \le (m - 1)^2 C^2 =: \tilde C$ for $\lambda \in (\lambda^*, \infty)$.

Now  let  a  point  yk  lie  in  the  domain  of  quadratic  convergence  of  Newton’s  method, 
i.e.  6  :=  \\h(yk,  Afc)||//(ytiAt)  <  where  x(^k)  is  the  “nearest”  center.  From  the 
proof  of  (2.4)  (and  from  Lemma  2)  follows  that  the  Hessian  matrices  in  x(A^)  and 
in  yk  +  h(yk,  Xk)  fulfill 

0.85||z||W(l(AJ[),At)  <  IMIw(y*+Myfc.Ak),A*)  ^  q^II2IIw(x(A*)>*) 

for  any  x  6  Mn.  Similarly,  the  Hessian  matrices  in  yk  and  in  yk  +  h{yk, Xk)  fulfill 
the  same  relationship  with  the  factors  |  resp.  5  (by  Lemma  2).  Putting  this 
together  the  eigenvalue.:  of  the  Hessian  matrices  in  yk  and  x(Ajt)  change  at  most 
by  a  factor  of  0  e  (fjpf?)  so  that  cond2(//(j/jfc,  A*))  <  |Y^cond2(//(x(Ait),  A*))  < 
3cond2(//(x(A*),  A^)).  This  completes  the  proof  of  Lemma  5. 




References 

[1] M. Grötschel, L. Lovász, A. Schrijver, Geometric algorithms and combinatorial optimization, (Springer Verlag, Heidelberg, 1988).

[2]  D.  den  Hertog,  C.  Roos,  T.  Terlaky,  “On  the  classical  logarithmic  barrier  function  method  for  a 
class  of  smooth  convex  programming  problems”,  Report  90-28,  Delft  University  of  Technology, 
The  Netherlands  (1990). 

[3]  F.  Jarre,  “On  the  convergence  of  the  method  of  analytic  centers  when  applied  to  convex 
quadratic  programs”,  Report  No.  35  (1987),  Schwerpunktprogramm  der  DFG  Anwendungsbe- 
zogene  Optimierung  und  Steuerung.  To  appear  in  Mathematical  Programming  49  (1991). 

[4]  F.  Jarre,  “The  method  of  analytic  centers  for  solving  smooth  convex  programs”,  Lecture  Notes 
in  Mathematics ,  Vol.  1405  Optimization,  S.  Dolecki  ed.,  Springer  (1989)  69-85. 

[5]  F.  Jarre,  The  method  of  analytic  centers  for  smooth  convex  programs ,  (Grottenthaler  Verlag, 
Bamberg,  1989). 

[6] F. Jarre, G. Sonnevend, J. Stoer, “An implementation of the method of analytic centers”, in: A. Bensoussan, J.L. Lions eds., Lecture Notes in Control and Information Sciences 111 (New York, Springer 1988).

[7] I.J. Lustig, R.E. Marsten, D.F. Shanno, “On implementing Mehrotra’s predictor corrector interior point method for linear programming”, Technical Report SOR 90-03, Dept. of Civil Eng. and OR, Princeton University, Princeton, NJ 08544 (1990).

[8] S. Mehrotra, “On the implementation of a (primal-dual) interior point method”, Technical Report 90-03, Dept. of Ind. Engineering and Management Sciences, Northwestern University, Evanston, IL (1990).

[9] S. Mehrotra, J. Sun, “An interior point algorithm for solving smooth convex programs based on Newton’s method”, Technical Report 88-08, Dept. of Ind. Engineering and Management Sciences, Northwestern University, Evanston, IL (1988).

[10] J. Mennicken, “Implementation of a first order central path following algorithm for solving large linear programs”, Report No. 202, Schwerpunktprogramm der DFG Anwendungsbezogene Optimierung und Steuerung, Institut für Ang. Math. und Statistik, Universität Würzburg, Am Hubland (1990).

[11] J.E. Nesterov, A.S. Nemirovsky, “A general approach to polynomial-time algorithms design for convex programming”, report, Central Economical and Mathematical Institute, USSR Acad. Sci. (Moscow, USSR, 1988).

[12] J.E. Nesterov, A.S. Nemirovsky, “Self-concordant functions and polynomial-time methods in convex programming”, report, Central Economical and Mathematical Institute, USSR Acad. Sci. (Moscow, USSR, 1989).

[13]  J.  Renegar,  “A  polynomial-time  algorithm  based  on  Newton’s  method  for  linear  programming”, 
Mathematical  Programming  40  (1988)  59-93. 

[14]  G.  Sonnevend,  “An  “analytical  centre”  for  polyhedrons  and  new  classes  of  global  algorithms 
for  linear  (smooth,  convex)  programming”,  Lecture  Notes  in  Control  and  Information  Sciences 
84  (1986)  866-878. 

[15]  G.  Sonnevend,  J.  Stoer,  “Global  ellipsoidal  approximations  and  homotopy  methods  for  solving 
convex  analytic  programs”,  Appl.  Math.  and  Optimization  21  (1989)  139-166. 

[16]  G.  Sonnevend,  J.  Stoer,  G.  Zhao,  “On  the  complexity  of  following  the  central  path  by  linear 
extrapolation  in  linear  programs”,  to  appear  in  U.  Rieder  and  P.  Kleinschmidt,  eds.,  Proc.  14. 
Symp.  on  Operations  Research  (Ulm  1989). 




REPORT  DOCUMENTATION  PAGE 

Form  Approved 
OMB  No.  0704-0188 


1.  AGENCY  USE  ONLY  (Leave  blank) 


3.  REPORT  TYPE  AND  DATES  COVERED 

Technical  Report 


4.  TITLE  AND  SUBTITLE 

Interior-Point  Methods  for  Convex  Programming 


5.  FUNDING  NUMBERS 

N00014-90-J-1242 


6.  AUTHOR(S) 

Florian  Jarre 


7.  PERFORMING  ORGANIZATION  NAME(S)  AND  ADDRESS(ES) 

Department  of  Operations  Research  -  SOL 
Stanford  University 
Stanford,  CA  94305-4022 


8.  PERFORMING  ORGANIZATION  REPORT  NUMBER 

1111MA 


9.  SPONSORING / MONITORING  AGENCY  NAME(S)  AND  ADDRESS(ES) 

Office  of  Naval  Research  -  Department  of  the  Navy 
800  N.  Quincy  Street 
Arlington,  VA  22217 


10.  SPONSORING / MONITORING  AGENCY  REPORT  NUMBER 

SOL  90-16 


12a.  DISTRIBUTION / AVAILABILITY  STATEMENT 

UNLIMITED 


12b.  DISTRIBUTION  CODE 


13.  ABSTRACT  (Maximum  200  words) 

This  work  is  concerned  with  generalized  convex  programming  problems,  where  the  objective  and 
also  the  constraints  belong  to  a  certain  class  of  convex  functions.  It  examines  the  relationship  of  two 
conditions  for  generalized  convex  programming  (self-concordance  and  a  relative  Lipschitz  condition)  and 
gives  an  outline  for  a  short  and  simple  analysis  of  an  interior-point  method  for  generalized  convex 
programming.  It  generalizes  ellipsoidal  approximations  for  the  feasible  set,  and  in  the  special  case  of  a 
nondegenerate  linear  program  it  establishes  a  uniform  bound  on  the  condition  number  of  the  matrices 
occurring  when  the  iterates  remain  near  the  path  of  centers. 


14.  SUBJECT  TERMS 

convex  program;  ellipsoidal  approximation; 
relative  Lipschitz  condition;  self-concordance. 


16.  PRICE  CODE 


17.  SECURITY  CLASSIFICATION  OF  REPORT 
18.  SECURITY  CLASSIFICATION  OF  THIS  PAGE 
19.  SECURITY  CLASSIFICATION  OF  ABSTRACT 
20.  LIMITATION  OF  ABSTRACT 

UNCLASSIFIED  SAR 


NSN  7540-01-280-5500 